Artificial intelligence could soon get a whole lot smarter, after Facebook said it’s open-sourcing fastText, a library of code for text representation and classification that it uses to power its bots.
The release is part of the Facebook AI Research (FAIR) lab’s goal of helping engineers and researchers by making its work publicly available. The fastText library can be downloaded from GitHub, and requires a compiler with “good C++11 support.” FAIR has also published its research relating to fastText.
The best thing about fastText is its speed and efficiency. Facebook says that, as the name suggests, fastText is much faster than other machine-learning techniques, and is able to train models “on more than one billion words in less than 10 minutes using a standard multicore CPU.” Indeed, FAIR says that fastText is so fast that it could potentially cut training time from several days to just a few seconds.
FastText’s main focus is on classifying words and sentences in order to produce reference data that programs can consult when they’re executing tasks. So, for example, fastText is able to learn that words such as “man,” “woman,” “boy” and “girl” are gendered nouns, and store those associations in a document. Later, when an AI bot tries to interpret a request such as “where my girls at,” it can refer to the document and understand that the user is asking for female names.
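To make the store-then-look-up idea concrete, here is a minimal Python sketch. It is a toy illustration only and not fastText's actual method or API: the `train` and `interpret` functions, the "female" and "male" labels, and the handful of training examples are all hypothetical, and the sketch simply counts word-label pairs rather than learning the word representations fastText is built around.

```python
from collections import Counter, defaultdict

def train(examples):
    """Build a word -> label reference from (text, label) pairs (hypothetical helper)."""
    counts = defaultdict(Counter)
    for text, label in examples:
        for word in text.lower().split():
            counts[word][label] += 1
    # Store the label seen most often with each word.
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

def interpret(request, reference):
    """Return the label most of the known words in the request point to, or None."""
    votes = Counter(
        reference[word] for word in request.lower().split() if word in reference
    )
    return votes.most_common(1)[0][0] if votes else None

# Made-up training data for illustration.
examples = [
    ("woman", "female"),
    ("girl", "female"),
    ("girls", "female"),
    ("man", "male"),
    ("boy", "male"),
]

reference = train(examples)
print(interpret("where my girls at", reference))
```

In this sketch the stored `reference` dictionary plays the role of the document the article describes: it is built once and then consulted each time a new request arrives. Words that never appeared in training are simply ignored.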
In a blog post, FAIR said that sharing the fastText code would “ultimately help us all design better applications and further advances in language understanding.” Using fastText, developers will be able to build smarter bots, which are notoriously prone to flaws yet are growing in popularity on Facebook Messenger and other platforms.
Indeed, Facebook said last July that there are more than 18,000 bots on Messenger, and that figure will likely grow significantly in the future.
“With the growing amount of online data, there is a need for more flexible tools to better understand the content of very large datasets, in order to provide more accurate classification results,” FAIR’s researchers said.