Hugging Face (HF for short) closed a $100 million Series C round at a $2 billion valuation. I have been following HF for a while. Below is my understanding.

GitHub for models

HF's rise began with its PyTorch implementation of the open-source BERT model. HF has always wanted to be the GitHub for models.

In practice, the user experience is closer to a model zoo (a model marketplace). A few people upload their models and share them with the public; most people simply download models from the marketplace. I would say it’s more like SourceForge than GitHub. I think the problem is that, for most people, it is difficult to do further development on a model after forking it.

This is why some people have questioned whether the model community can be commercialized.

GitHub for pipelines/data apps

Now HF has started building "Spaces", a GitHub for pipelines/data apps.

Unlike models, pipelines and data apps are easy for users to modify after forking them. There are already some players on this track, such as Google Colab, Streamlit (acquired by Snowflake), 8080 Labs (acquired by Databricks), and some other notebook-as-a-service vendors.
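To make the "easy to modify after forking" point concrete, here is a minimal sketch of the kind of data app that could live in a Space, written with Gradio. The function names `shout` and `build_demo` are hypothetical and only for illustration; I am assuming Gradio is installed for the interface part. The whole app is a few lines of plain Python, so someone who forks it can change its behavior by editing one function, which is very different from retraining a forked model.

```python
def shout(text: str) -> str:
    """Toy 'pipeline' step (hypothetical): upper-case the input and add emphasis."""
    return text.strip().upper() + "!"


def build_demo():
    """Wrap shout() in a simple text-in/text-out Gradio interface.

    Gradio is imported lazily so shout() can be used and tested without it.
    """
    import gradio as gr

    return gr.Interface(fn=shout, inputs="text", outputs="text")


# To serve the app locally (assumes Gradio is installed):
#   build_demo().launch()
```

A fork that wants different behavior only needs to edit `shout`; the interface code stays the same.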

HF also acquired Gradio to strengthen the Spaces product. (The deal details were not disclosed.) Spaces is a major step for HF. Will it be a significant step for the industry as well? We should know within the next two years.

BigScience model

HF is training the BigScience large model (176 billion parameters), which targets OpenAI’s GPT-3.

However, Meta got there first by open-sourcing its OPT large model, which also targets GPT-3. But I don’t think this will have any negative impact on HF.

HF plans to provide more complete AI capabilities through BigScience + Spaces. If it succeeds, its offering will be more competitive than the services offered by OpenAI.

I think the biggest uncertainty for HF is whether Spaces can succeed. It’s not only a question of whether HF can do better than its competitors but also of whether people need a GitHub for pipelines/data apps at all.