infnorm's comments

infnorm · on Nov 18, 2017

Not at this time. However, in principle it would be possible to create such bindings.

infnorm · on Nov 18, 2017

Quantization comes in many different forms. TensorFlow lite provides optimized kernels for 8-bit uint quantization. This specific form of evaluation is not directly supported in TensorFlow right now (though it can train such a model). We will be releasing training scripts that show how to setup such models for evaluation.

infnorm · on Nov 18, 2017

This should be possible, but we haven't tried it. We're likely going to add a simplified target that has minimal dependencies (like no Eigen) that allows building on simple platforms.

dangjc · on Nov 18, 2017

Cool. I have something else that uses Eigen in WebAssembly, so that hasn't caused any issues btw.

infnorm · on Nov 17, 2017

The quantization is done with a special training script that is quantization aware. We will be open sourcing a mobilenet quantized training script to show how to do this soon.

infnorm · on Nov 14, 2017

We developed TensorFlow lite to be small enough to target really small devices that lack MMU’s like the ARM Cortex M MCU series, but we haven’t done the actual work to target those devices. That being said, we are excited when the ecosystem and community around machine learning expands.

ianhowson · on Nov 15, 2017

Cortex-M compatibility was literally my first thought when I read this -- especially low-memory systems. Might have to hack it up myself.

infnorm · on Nov 14, 2017

The main TensorFlow interpreter provides a lot of functionality for larger machines like servers (e.g. Desktop GPU support and distributed support). Of course, TensorFlow lite does run on standard PCs and servers, so using it on non-mobile/small devices is possible. If you wanted to create a very small microservice, TensorFlow lite would likely work, and we’d love to hear about your experiences, if you try this.

barbolo · on Nov 15, 2017

Thanks for the answer. Currently I’m using AWS Lambda to deploy my TensorFlow models. But it’s pretty hard and hacky. I need to remove a considerable portion of the code base that is not needed for inference only routines. I do that so the code loads faster and to fit the deployment package size limit. If TensorFlow Lite is already a compressed code, then it may be much easier to deploy it to a serverless environment. I’ll be trying it in my next deployments.

infnorm · on Nov 18, 2017

Sounds really interested. We're excited to hear about how that goes.

infnorm · on Nov 14, 2017

TensorFlow Lite is an interpreter in contrast with XLA which is a compiler. The advantage of TensorFlow lite is that a single interpreter can handle several models rather than needing specialized code for each model and each target platform. TensorFlow Lite’s core kernels have also been hand-optimized for common machine learning patterns. The advantage of compiler approaches is fusing many operations to reduce memory bandwidth (and thus speed). TensorFlow lite fuses many common patterns in the TensorFlow converter. We are of course excited about the possibility of using JIT techniques and using XLA technology within the TensorFlow Lite interpreter or as part of the TensorFlow Lite converter as a possible future direction.