You are working under the assumption that this tech operates in an O(n)-or-better computational regime.
Ask ChatGPT:
“Assume the perspective of an expert in CS and Deep Learning. What are the scaling characteristics of deep learning (use LLMs and Transformer models if you need to be specific)? Express the answer in Big O notation. Tabulate results in two rows, ‘training’ and ‘inference’. For columns, provide the scaling characteristic for CPU, IO, Network, Disk Space, and time.”
This should get you big Os for n being the size of the input (i.e. context size). You can then ask a follow-up with n being the model size.
Spoiler: the best scaling number in that entire estimate set is quadratic. Be “scared” when a breakthrough in model architecture and pipeline gets near linear.
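To make the quadratic claim concrete, here is a minimal back-of-the-envelope sketch (my own illustrative constants, not exact FLOP counts): in a standard transformer layer, self-attention costs O(n²·d) in sequence length n, while the feed-forward block costs O(n·d²) in hidden dimension d. Doubling the context quadruples the attention cost but only doubles the MLP cost.

```python
# Rough FLOP scaling of one transformer layer.
# n = sequence (context) length, d = hidden dimension.
# Constant factors are illustrative; only the growth rates matter.

def attention_flops(n: int, d: int) -> int:
    # QK^T score matrix plus attention-weighted values: both ~n^2 * d
    return 2 * n * n * d

def mlp_flops(n: int, d: int) -> int:
    # Feed-forward block with the usual 4x expansion: ~n * d^2
    return 8 * n * d * d

d = 1024
for n in (1_000, 2_000, 4_000):
    print(f"n={n:>5}  attention={attention_flops(n, d):.2e}  mlp={mlp_flops(n, d):.2e}")
```

Running this shows attention FLOPs growing 4x per doubling of n while MLP FLOPs grow 2x, which is the quadratic-in-context behavior the estimate set above reports.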