What is synthetic data? And why should B2B marketers care?

Like so many next-big-things, the generative AI wave is towing a host of cottage industries in its wake. One of the most fascinating is the synthetic data industry.

I think it’s worth the attention of any B2B tech marketer because it reveals the complex challenges, opportunities, and risks of generative AI in microcosm – and because the best content about AI acknowledges and navigates that complexity.

Synthetic data: a solution to AI’s biggest obstacles

All AI models must be trained on extensive data. And the more general the task, the greater the variety and volume of data the model needs before it can respond with accuracy and confidence.

But collecting data volumes from the real world poses several issues:

  • Sourcing huge amounts of data is time-consuming and expensive.
  • It can be hard to find data on uncommon or edge-case scenarios (think MRI scans of rare medical conditions or images of a machine experiencing a one-in-a-million fault).
  • There are privacy and copyright issues with using certain online datasets (such as data gleaned from social media platforms).
  • Data produced by humans can carry human biases.

Synthetic data promises a solution to many of these problems. Unlike conventional data used to train AI models, synthetic data is artificially generated, so it isn’t bound by the confines of reality.

For example, if you were training an AI to assess fuel efficiency across different commercial aircraft, you could use synthetic data generated by flight simulators instead of collecting real-world aircraft telemetry data from hundreds of flights.

By creating artificial data at scale, you can get more data at a lower cost without the copyright complications or biases of human-generated data. And you can also design datasets covering phenomena seldom seen in real life.

Synthetic data’s potential to clear these roadblocks is significant enough that last summer, Gartner predicted 60% of the data used for AI would be synthetic by 2024.

The use cases unlocked by synthetic data

Computer vision models, which need training on large volumes of high-quality images, have been one of the first forms of AI to benefit from synthetic data. But there are many other use cases for synthetic data in its many forms, including:

  • Genomic data to train AI healthcare solutions on rare diseases – without breaching patient confidentiality.
  • Images of different (and potentially unreleased) products to train automatic defect recognition on manufacturing lines.
  • Financial records to develop fraud detection systems without using personal financial information.

Whatever task you want to train an AI model for, it’s likely that synthetic data can help make that process faster, more consistent, and cheaper.
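To make the fraud detection use case above a little more concrete, here is a minimal sketch of what generating synthetic records can look like. Everything in it is an assumption made for illustration: the function name, the field names, and the rule that fraudulent transactions are larger and happen at night are all hypothetical, and real synthetic data tools are far more sophisticated.

```python
import random


def make_synthetic_transactions(n, fraud_rate=0.2, seed=0):
    """Return n made-up transaction records; no real customer data is involved."""
    rng = random.Random(seed)  # seeded so the same dataset can be regenerated
    records = []
    for i in range(n):
        is_fraud = rng.random() < fraud_rate
        if is_fraud:
            # Hypothetical pattern: fraud skews towards large, late-night payments.
            amount = round(rng.uniform(500, 5000), 2)
            hour = rng.choice([0, 1, 2, 3, 4])
        else:
            amount = round(rng.uniform(5, 300), 2)
            hour = rng.randint(8, 22)
        records.append({"id": i, "amount": amount, "hour": hour, "is_fraud": is_fraud})
    return records


dataset = make_synthetic_transactions(1000, fraud_rate=0.2)
fraud_count = sum(r["is_fraud"] for r in dataset)
print(f"{fraud_count} of {len(dataset)} synthetic records are labelled as fraud")
```

The point of the sketch is the `fraud_rate` dial: a rare event can be made as common as the training process needs, which is hard to do with real-world records.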

The risk of AI eating itself

With so many use cases for synthetic data, there’s naturally a lot of demand. And one way to meet that demand is… with the help of generative AI. We’re already seeing some vendors working to build a closed loop for AI – where generative AI creates synthetic data that’s then fed into other AI models.

But this Ouroboros model of AI has its critics. When researcher Jathan Sadowski looked into the phenomenon, he found models that were “so heavily trained on the outputs of other generative AIs that [they] become an inbred mutant”.

A consumer-facing model spouting nonsense might, at worst, damage a brand’s reputation. But such degradation in a model designed to detect security risks for IT systems or cancerous cells in medical imaging could have catastrophic effects.
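The degradation described above can be illustrated with a toy simulation. This is a simplified sketch, not a reproduction of any researcher’s method: the “model” here is just a bell curve described by an average and a spread, and each generation is fitted only to a small sample produced by the generation before it. The function name and parameter values are assumptions chosen for illustration.

```python
import random
import statistics


def simulate_self_training(generations=200, sample_size=10, seed=1):
    """Track how the spread of a toy model changes when it trains on its own output."""
    rng = random.Random(seed)
    mean, spread = 0.0, 1.0  # generation 0 stands in for real-world data
    history = [spread]
    for _ in range(generations):
        # The current model generates a small synthetic dataset...
        samples = [rng.gauss(mean, spread) for _ in range(sample_size)]
        # ...and the next model is fitted to that synthetic data alone.
        mean = statistics.fmean(samples)
        spread = statistics.pstdev(samples)
        history.append(spread)
    return history


history = simulate_self_training()
print(f"spread at the start: {history[0]:.3f}")
print(f"spread after {len(history) - 1} generations: {history[-1]:.3g}")
```

Each refit loses a little of the original variety, and with no fresh real-world data to correct it, the losses compound until the toy model produces almost identical outputs every time.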

The implications for B2B tech companies and marketers

We’re still in the early days of this new generation of AI and the synthetic data that will support it. And with the major NASDAQ staples investing heavily in the space, any problems will have serious resources and talent thrown at them until they’re resolved.

So perhaps in the future, we will have something approaching a synthetic data utopia that leads to unfathomably powerful AI. But for now, we have a fork in the road that everyone in the B2B technology sector must navigate carefully.

Any story about synthetic data must be embraced with positivity and the hope that it will crack the code of training society-enhancing AI models. But we must also be ready to ask the most pressing questions about how synthetic data production can scale. And the level of scrutiny must be dialled up as generative AI and synthetic data training increasingly come into contact with critical, high-risk sectors like healthcare, education, and government.

More importantly, B2B tech marketers must be ready to openly discuss these challenges in any content that speaks about synthetic data and generative AI. Our audience is clever, connected, and very comfortable managing risk. They won’t be put off by an acknowledgment of the potential pitfalls and challenges in the field. In fact, they may find the honesty refreshing and ultimately trust the message and the brand behind it all the more.

Recommended further reading

If you want to learn more about synthetic data and AI, there are plenty of articles exploring this fast-growing field.

Although it was published just before the recent AI renaissance, this Forbes article covers some of the major use cases for synthetic data and the earliest players in the industry. It’s a great place to start if you want a broad overview of the topic.

And for a clearer look at the potential risks associated with synthetic data, this interview with machine learning researchers Sina Alemohammad and Josue Casco-Rodriguez offers an expert outlook on what happens when AI consumes data created by other AI models.

Star power: Can nuclear fusion fuel the earth?

We spend a lot of time writing about the impact of global warming, from mitigating the risks of climate change to accelerating decarbonisation and renewable energy adoption. And if I’ve learnt one thing, it’s that if the world doesn’t speed up its decarbonisation efforts, humanity could be facing a desolate future.

Solar and wind power are both brilliant steps in the right direction, but when there’s no wind and the sun isn’t shining, we can’t use them to produce electricity. So, what are the alternatives?

Imagine if there was a way to power the world that was clean, carbon free, and possible whatever the weather.

The answer could be written in the stars.

These giant balls of plasma generate an abundance of energy through a process known as nuclear fusion. But is it a process we could ever recreate on earth?

We already have nuclear energy. So what is fusion?

Today, nuclear power plants use a process called nuclear fission to produce energy.

Nuclear fission uses unstable atomic isotopes (like uranium-235) and harnesses the energy released when their nuclei split apart. It’s highly efficient and doesn’t generate carbon dioxide. However, fission does create some pretty nasty waste products that can stay radioactive for millions of years.

Typically, power plants use geological disposal to handle this waste – burying radioactive material deep underground so thick layers of rock can stop radiation reaching the earth’s surface.

But if radioactive material escapes containment, whether through a disaster or a reactor meltdown, the consequences can be utterly devastating.

Instead of splitting heavy atoms, nuclear fusion combines two isotopes of hydrogen: deuterium, which can be extracted from water, and tritium, which can be produced from lithium. This creates an atom of helium, a lone neutron, and a lot of energy.
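For the physics geeks, the reaction described above is usually written as follows. The energy figure of roughly 17.6 MeV per reaction is the commonly cited textbook value rather than something taken from this article.

```latex
{}^{2}_{1}\mathrm{H} \;+\; {}^{3}_{1}\mathrm{H} \;\longrightarrow\; {}^{4}_{2}\mathrm{He} \;+\; \mathrm{n} \;+\; 17.6\ \mathrm{MeV}
```

Here the two hydrogen isotopes on the left are deuterium and tritium, and the products on the right are a helium nucleus, a free neutron, and the released energy.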

In fact, fusion can generate nearly 4 million times more energy per kilogram of fuel than oil or coal, with no carbon emissions at all. There’s also no long-lived radioactive waste; the only radioactive ingredient is tritium, a beta emitter with a short half-life of just over 12 years. And there’s no risk of meltdowns, as fusion reactions can’t sustain themselves outside of a reactor.
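To get a feel for what a half-life of just over 12 years means in practice, here is a quick back-of-the-envelope sketch. The 12.3-year value is an approximation, and the function name and the time spans are chosen purely for illustration.

```python
TRITIUM_HALF_LIFE_YEARS = 12.3  # approximate; "just over 12 years"


def fraction_remaining(years, half_life=TRITIUM_HALF_LIFE_YEARS):
    """Fraction of the original tritium left after the given number of years."""
    # Every half-life that passes halves whatever was left.
    return 0.5 ** (years / half_life)


for years in (12.3, 50, 100):
    print(f"after {years} years: {fraction_remaining(years):.2%} of the tritium remains")
```

After a century, more than eight half-lives have passed, so well under 1% of the original tritium is left. That is a very different timescale from the fission waste described earlier.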

It’s a lot safer than fission. But it’s also far more difficult to achieve.

Major developments are paving the way for fusion on earth

To make fusion reactions happen, scientists need to overcome the natural electrostatic repulsion between positively charged deuterium and tritium nuclei. For that, they need to create a huge amount of heat and pressure.

Currently scientists are looking at two key methods to achieve this: magnets and lasers. And recently there have been major breakthroughs in both.

South Korea’s electromagnetic tokamak

South Korea’s “Artificial Sun” is a type of fusion reactor called a tokamak. It’s a donut-shaped device that uses magnetic coils to create the intense conditions needed for nuclear fusion. These magnets produce a twisted magnetic field that confines the superheated fuel so that deuterium and tritium nuclei collide and fuse, releasing energy that heats the walls of the reactor. This heat can then convert water to steam, which powers turbines and generates usable electricity.

In 2022, the Artificial Sun sustained a temperature of 100 million degrees Celsius for 30 seconds, and the team are aiming for 5 minutes by the end of 2026. It’s an unimaginable temperature. To put it into context, the centre of the Sun is only a puny 15 million degrees Celsius.

The lasers of America’s National Ignition Facility

In the US, the Lawrence Livermore National Laboratory has used lasers to achieve the first ever net energy gain from nuclear fusion. Physicists fired 192 lasers at a target containing deuterium and tritium, causing it to implode with enough force that the atoms fused and released energy.

To be useful to humanity, the energy produced needs to be greater than the energy put in. And the US team has now achieved this not just once, but four times.
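The “more out than in” test above comes down to a single ratio, often called the gain. The sketch below uses the widely reported figures for the December 2022 experiment (about 2.05 megajoules of laser energy delivered to the target and about 3.15 megajoules of fusion energy released). Those numbers are not from this article, so treat them as illustrative, and the function name is hypothetical.

```python
def energy_gain(energy_out_mj, energy_in_mj):
    """Ratio of fusion energy released to energy delivered to the target."""
    if energy_in_mj <= 0:
        raise ValueError("energy_in_mj must be positive")
    return energy_out_mj / energy_in_mj


# Widely reported December 2022 figures, used here as an assumption.
# "Energy in" means laser energy reaching the target, not the electricity
# needed to run the lasers.
gain = energy_gain(energy_out_mj=3.15, energy_in_mj=2.05)
print(f"gain: {gain:.2f} (anything above 1 means more energy out than in)")
```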

Nuclear fusion could be the future of clean energy

Nuclear energy is gaining traction worldwide. It was formally recognised as one of the solutions to climate change in the COP28 agreement, and many governments are now pledging more funding for nuclear research.

Current fusion science is a far cry from the cold fusion controversies of the 20th century, and every new development gets us closer to achieving a clean, carbon-free, and near-infinite energy source.

I’m fortunate enough to get to write about electrification and renewable energy in my work at Radix, and it’s so exciting to think that one day – albeit in a few decades – I might be writing about fusion energy in the same way.

If you’re a bit of a physics geek like me, and curious to learn more about nuclear fusion, the International Atomic Energy Agency is a great place to start.