When it comes to machine learning (ML), the importance of feature engineering is often overlooked, and it becomes a missed opportunity that many businesses simply pass by.
That is a real loss for data scientists where accurate ML models and predictions are concerned.
Hence the need for a feature store: a library for your organisation’s curated data, and an accessible, central location where you can store important features for ML models and their predictions.
Why is this valuable? Because it allows organisations and their data scientists to keep track of their data resources, ensuring that the most relevant and useful data is available for any ML project.
- What are feature stores?
- Does your organisation need one?
- And when should you start building one out?
In this three-part Machine Learning Feature Store series, we answer some of these pressing questions with the help of our Machine Learning experts, Christiaan Viljoen and Dominic Kafka.
Firstly, What Is A Feature?
In feature engineering, a feature is a specific attribute of data that is useful for modelling. It can come from a single raw data point or from an aggregation of points.
Features can be either numeric or categorical, and can be extracted from data either manually or with automation tools, a process commonly known as ‘feature extraction’.
Kafka elaborates, “An example of a feature could be in the building of a model to predict fraudulent transactions: a relevant feature might be whether or not a person’s spending habits seem unusual, or if they’ve made any purchases in a different country.”
“Features are usually built from aggregations. For example, sums, averages, minimums, maximums, etc. They can then inform and enable an ML model to predict something,” adds Viljoen.
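To make the idea of aggregated features concrete, here is a minimal sketch in plain Python. It is illustrative only: the transaction rows, the `HOME_COUNTRY` lookup, the `build_features` function and the feature names are all hypothetical, invented for this example rather than taken from any real system. It turns raw transactions into one row of features per customer, using the sums, averages, minimums and maximums mentioned above, plus a simple flag for purchases made in a different country.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw transactions: (customer_id, amount, country)
transactions = [
    ("c1", 20.0, "ZA"),
    ("c1", 35.0, "ZA"),
    ("c1", 900.0, "FR"),
    ("c2", 15.0, "ZA"),
    ("c2", 25.0, "ZA"),
]

# Assumed lookup of each customer's home country (illustration only)
HOME_COUNTRY = {"c1": "ZA", "c2": "ZA"}


def build_features(rows, home_country):
    """Aggregate raw transactions into one feature row per customer."""
    amounts = defaultdict(list)
    countries = defaultdict(set)
    for customer_id, amount, country in rows:
        amounts[customer_id].append(amount)
        countries[customer_id].add(country)

    features = {}
    for customer_id, values in amounts.items():
        features[customer_id] = {
            # Numeric features built from aggregations
            "txn_count": len(values),
            "total_spend": sum(values),
            "avg_spend": mean(values),
            "min_spend": min(values),
            "max_spend": max(values),
            # Boolean flag: any purchase outside the home country?
            "bought_abroad": any(
                c != home_country[customer_id] for c in countries[customer_id]
            ),
        }
    return features


customer_features = build_features(transactions, HOME_COUNTRY)
```

A fraud model could then compare a new transaction against values such as `avg_spend` and `max_spend` to judge whether the spending looks unusual. In practice these aggregations would more likely run in SQL or a data-processing framework over far larger datasets.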
What Is a Feature Store?
A feature store is a centralised data management system that enables you to store, manage and distribute features to your ML models. It makes your features easy to access for analytics and ML projects, and helps to ensure that they are well-managed and of high quality.
Uber’s ML engineering team built and popularised feature stores back in 2017 when they introduced Michelangelo, an ML-as-a-Service (MaaS) platform, which made building, deploying and operating ML solutions a bearable process.
“We found great value in building a centralised Feature Store in which teams around Uber can create and manage canonical features to be used by their teams and shared with others… At the moment, we have approximately 10,000 features in the Feature Store that are used to accelerate machine learning projects, and teams across the company are adding new ones all the time.” – __Uber__
Today you can find a wide range of competing feature stores, each with its own benefits and capabilities. Big tech brands offer them too: Google has one in Vertex AI and Amazon has one in SageMaker.
Ultimately, having one can improve the accuracy of your models, save time and increase productivity. It also reduces the time that data scientists spend discovering and recalculating features that already exist within the same company.
“If you are a data scientist and you create features for a specific model, other data scientists could also potentially use those same features for their own models. If you’re dealing with massive, petabyte-sized datasets and you build a feature on something that takes several hours to create, there’s another guy using that same feature, doing the same thing,” says Viljoen.
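The reuse that Viljoen describes can be sketched with a toy, in-memory stand-in for a feature store. This is a simplified illustration, not how Michelangelo, Vertex AI or SageMaker work internally: the `ToyFeatureStore` class, its `register` and `get` methods, and the feature name are all hypothetical. The point it shows is that a feature is defined once under a shared name, computed once, and then served to every model that asks for it.

```python
from typing import Any, Callable, Dict


class ToyFeatureStore:
    """In-memory stand-in for a feature store (illustrative only)."""

    def __init__(self) -> None:
        self._definitions: Dict[str, Callable[[], Any]] = {}
        self._values: Dict[str, Any] = {}
        # Counts how often each feature was actually computed
        self.compute_calls: Dict[str, int] = {}

    def register(self, name: str, compute: Callable[[], Any]) -> None:
        """Store the recipe for a feature under a shared, canonical name."""
        if name in self._definitions:
            raise ValueError(f"feature {name!r} is already registered")
        self._definitions[name] = compute
        self.compute_calls[name] = 0

    def get(self, name: str) -> Any:
        """Return the feature, computing it only on the first request."""
        if name not in self._values:
            self._values[name] = self._definitions[name]()
            self.compute_calls[name] += 1
        return self._values[name]


def expensive_avg_spend() -> Dict[str, float]:
    # Stand-in for a job that could take hours on a very large dataset
    return {"c1": 318.33, "c2": 20.0}


store = ToyFeatureStore()
store.register("avg_spend_per_customer", expensive_avg_spend)

# Two different teams request the same feature
fraud_model_input = store.get("avg_spend_per_customer")  # computed here
churn_model_input = store.get("avg_spend_per_customer")  # reused, not recomputed
```

A real feature store adds much more on top of this, such as persistent storage, versioning and access control, but the compute-once, share-everywhere pattern is the part that saves the second data scientist those several hours.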
Your Solution To Feature Engineering + An AMA With Our Experts!
Want to build your own feature store, but don’t know where to start?
Then come and join us! We’re hosting an AMA with Christiaan Viljoen and Dominic Kafka on all things ML and feature stores.
If you’re ready to learn about:
- How feature stores work,
- Whether they are right for your business, or
- The steps to build one out…
Then our experts are here to guide you!
The session will be packed with exciting and exclusive industry tips that touch on everything from airlines to banks.
Don’t miss out! Register for this fantastic opportunity today:
[PSA: Keep your eyes peeled for the next two articles in our Machine Learning Feature Store series!]