If it touches upon machine learning, it’s part of Anne Schuth’s job. In only a few years, Anne has built a small but impressive machine learning empire within DPG Media. Here is his personalized story.
In 2018, Anne joined as a Machine Learning Engineer, set up a team, and leveled up news personalization for all of DPG Media’s news brands. Now, he’s proudly wearing his fifth job title: Machine Learning Architect. Anne laughs: “Yes, the fifth one already, but it’s a very logical sequence of job titles. They’ve evolved in line with my tasks and responsibilities.”
Anne’s steps within the company are illustrative of how things work at DPG Media. If you have a good idea and you convince the right people, you can make it happen. “It’s pretty cool that it’s not all mapped out from above. If you think of something of value, you can just plan a meeting with a director. I know that in a lot of companies, that is out of the question.”
Anne looks back on fruitful years in which the News Personalization squad grew from zero to twelve people. All of them work on machine learning applications for news recommendation. As a Machine Learning Architect, Anne focuses on the technical aspects of building a large-scale recommendation system and oversees the technical challenges of content understanding, search, user understanding, and ranking.
Before Anne’s arrival, DPG Media was only just, cautiously, exploring personalization. Anne: “I remember a Proof of Concept for Personalization as a Service, but it wasn’t quite right. It was more of an e-commerce solution which you see at Bol.com or Amazon. ‘People who have read this also read…’ That’s a very limited approach to personalization. Also, a huge downside was that such an approach only works for older content while we are serving readers news, inherently new.”
Time for a different approach. “Don’t focus on what other people are reading, but focus on the contents of an article,” says Anne. When he and his growing team started delivering their first products based on this rationale, personalization gained real traction within the company. “People started to see the effectiveness and slowly, but surely, started to realize that DPG Media is big enough to work on machine learning in-house. Also, I strongly believe that we, a publisher, shouldn’t outsource news recommendation as it’s such a fundamental part of our products. In the end, we select the news. Normally, the editorial staff would do that. If you leave it up to an algorithm, you have to understand that algorithm inside and out.”
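To make the idea of recommending on article content concrete, here is a minimal sketch. It is not the squad's actual system: the bag-of-words vectors, the cosine score, and the names `term_vector`, `cosine`, and `rank_by_content` are all assumptions chosen for illustration. The point it shows is that a brand-new article can be ranked the moment it is published, because only its text is needed, not a history of who has read it.

```python
import math
from collections import Counter


def term_vector(text):
    """Lowercased bag-of-words counts; a simple stand-in for a real content model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_content(read_texts, candidates):
    """Rank candidate articles by similarity to what the reader has already read.

    read_texts: list of article texts the reader has read (hypothetical input).
    candidates: dict mapping article id to article text (hypothetical input).
    """
    # The reader profile is the sum of the term vectors of the articles read.
    profile = Counter()
    for text in read_texts:
        profile.update(term_vector(text))

    scored = [
        (cosine(profile, term_vector(text)), article_id)
        for article_id, text in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [article_id for _, article_id in scored]


history = ["city council votes on housing budget"]
fresh_articles = {
    "a": "council approves new housing plan",
    "b": "striker scores twice in cup final",
}
ranking = rank_by_content(history, fresh_articles)
```

In this toy example, article "a" shares terms with the reader's history and is ranked above "b", even though nobody has read either article yet.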
The devil’s advocate and a moral compass
Anne thinks DPG Media is a stimulating company to work for if you’re into machine learning. The technical challenges are pretty thrilling, as the scale and impact are massive, and news recommendation is just fun to work on. And then there’s the moral appeal. Anne explains: “If we don’t get it right, companies like Google and Facebook will. They’ll sideline us. For society’s sake, I think that it’s important that a publisher takes on this role and not the foreign tech giants that have no sense of social responsibility.”
The devil’s advocate will argue that society doesn’t need news personalization and is not interested in filter bubbles. Anne recognizes the resistance but says personalization doesn’t necessarily lead to filter bubbles. “Part of our job is explaining that we’re not creating filter bubbles. Personalization is nothing more than adjusting a product to a person, and yes, we could do that by creating a filter bubble. But we could also do the opposite and show people everything they don’t like. That’s also personalization, but it’s of no use to the user or to us.”
The real question is how to adjust a product so that it improves the user experience. For news, that means not exhausting readers by throwing the thousands of articles created daily by DPG Media’s news brands at them. One way or another, a manageable selection needs to be made, ensuring that a person is informed. And, unfortunately, it’s not as simple as a filter bubble. It’s always a mix of editorial selection and algorithmic personalization.
“One of our journalistic responsibilities is to inform people, and there are different ways to do so – no filter bubble needed. Let’s take me as an example. I’m not interested in soccer; I don’t want to read it, I don’t want to see it. So just don’t serve it. It’s as simple as that. It’s a very basic personalization option, one that not many people object to, but it is of high value as it opens up space on our news platforms for other content.”
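The opt-out Anne describes can be sketched in a few lines. This is an illustration only: the article dictionaries, the `topics` field, and the function name `filter_excluded_topics` are assumptions, not DPG Media's data model.

```python
def filter_excluded_topics(articles, excluded_topics):
    """Drop every article tagged with a topic the reader has opted out of.

    articles: list of dicts with an "id" and a list of "topics" (hypothetical shape).
    excluded_topics: topics the reader does not want to see.
    """
    excluded = {topic.lower() for topic in excluded_topics}
    return [
        article
        for article in articles
        if not excluded & {topic.lower() for topic in article["topics"]}
    ]


front_page = [
    {"id": "1", "topics": ["Politics"]},
    {"id": "2", "topics": ["Soccer", "Sports"]},
    {"id": "3", "topics": ["Economy"]},
]
visible = filter_excluded_topics(front_page, ["soccer"])
visible_ids = [article["id"] for article in visible]
```

The slot freed by the soccer article can then go to other content, which is the value Anne points to.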
Let’s talk tech: algorithms, tech stack, and more
But how does this machine learning thing actually work? Is the entire squad constantly writing and adjusting algorithms? Nope. The algorithms are at the heart of the systems the Machine Learning Engineers work on, but most time is spent on the systems themselves. The squad works with an advanced tech stack, including Python, Kubernetes, Redis, Elasticsearch, Airflow, Delta, and MLflow. Kafka is used for stream processing and to connect all the dots in the infrastructure, and PySpark is used for batch processing. Everything runs on AWS.
One of the products Anne’s squad recently delivered is personalized push notifications for Algemeen Dagblad and its regional titles. “The notifications are mainly location-based. Each day, we send hundreds of thousands of notifications. For every single published article, we basically consider each user individually. That’s an enormous load we have to deal with intelligently.”
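One common way to keep that kind of load manageable is to avoid scanning every user for every article. The sketch below is an assumption about how such matching could be done, not a description of the squad's system: it builds an index from region to subscribed users once, so each published article only touches the users in its own regions. The names `build_region_index` and `recipients_for` and the shape of the input data are hypothetical.

```python
from collections import defaultdict


def build_region_index(user_regions):
    """Map each region to the set of users who follow it.

    user_regions: dict mapping user id to a set of region names (hypothetical shape).
    """
    index = defaultdict(set)
    for user_id, regions in user_regions.items():
        for region in regions:
            index[region].add(user_id)
    return index


def recipients_for(article_regions, index):
    """Return the users following at least one region the article is about."""
    recipients = set()
    for region in article_regions:
        # .get avoids creating empty entries for regions nobody follows.
        recipients |= index.get(region, set())
    return recipients


subscriptions = {
    "u1": {"Rotterdam"},
    "u2": {"Utrecht", "Rotterdam"},
    "u3": {"Breda"},
}
region_index = build_region_index(subscriptions)
rotterdam_recipients = recipients_for({"Rotterdam"}, region_index)
```

With the index in place, the cost per article depends on how many users follow its regions instead of on the total number of users.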