At the Thomas Jefferson National Accelerator Facility, data scientists and developers from the U.S. Department of Energy are testing the latest artificial intelligence (AI) techniques to make high-performance computing more reliable and cost-efficient.
Their focus? Training artificial neural networks to monitor and predict the performance of scientific computing clusters—massive systems where enormous amounts of data are constantly being processed.
The objective is clear: help system administrators detect and resolve problematic computing jobs faster, minimizing downtime for scientists who rely on these systems to analyze experimental data, Newswise reports.
But instead of a typical software deployment, this effort takes on the flair of a competition. These machine learning (ML) models are put through their paces in a head-to-head challenge to determine which one best adapts to the ever-changing demands of experimental datasets.
However, unlike America’s Next Top Model and its international spin-offs, this contest doesn’t take an entire season to declare a winner. Here, a new “champion model” is selected every 24 hours based on its ability to learn and adapt to the latest data.
“We’re trying to understand characteristics of our computing clusters that we haven’t seen before,” said Bryan Hess, scientific computing operations manager at Jefferson Lab and one of the study’s lead investigators. “It’s looking at the data center in a more holistic way, and going forward, that’s going to be some kind of AI or ML model.”
While these AI models may not be gracing magazine covers anytime soon, the project has gained recognition in the research community. It was recently featured in IEEE Software, a peer-reviewed scientific journal, as part of a special edition on machine learning operations (MLOps) in data centers.
AI Meets Big Science
Large-scale scientific instruments—such as particle accelerators, light sources, and radio telescopes—are essential for groundbreaking discoveries. Facilities like Jefferson Lab’s Continuous Electron Beam Accelerator Facility (CEBAF), a DOE Office of Science User Facility, serve a worldwide community of over 1,650 nuclear physicists.
At Jefferson Lab, experimental detectors capture subtle traces of tiny particles produced by CEBAF’s electron beams. Since the accelerator operates 24/7, it generates an enormous amount of data—tens of petabytes per year, enough to fill an average laptop’s hard drive every minute.
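As a rough back-of-the-envelope check on that rate, the sketch below converts an annual data volume into gigabytes per minute. The 10 and 50 petabyte figures are assumed stand-ins for "tens of petabytes," decimal units are assumed (1 PB = 1,000,000 GB), and the function name is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year
GB_PER_PB = 1_000_000             # decimal units: 1 PB = 10^6 GB


def gb_per_minute(petabytes_per_year: float) -> float:
    """Average data rate in GB per minute for a given annual volume."""
    return petabytes_per_year * GB_PER_PB / MINUTES_PER_YEAR


# Assumed sample volumes standing in for "tens of petabytes per year".
for pb in (10, 50):
    print(f"{pb} PB/year is about {gb_per_minute(pb):.0f} GB per minute")
```

Under these assumptions the average rate comes to tens of gigabytes per minute. How that compares with a laptop depends on the drive size assumed.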
Processing these vast amounts of information requires high-throughput computing clusters, which run specialized software tailored to each experiment.
With so many complex jobs running simultaneously, failures are inevitable. Some computing tasks or hardware issues can cause anomalies, such as fragmented memory or input/output (I/O) bottlenecks, which can delay scientists’ ability to process and analyze their data.
“When compute clusters get bigger, it becomes tough for system administrators to keep track of all the components that might go bad,” explained Ahmed Hossam Mohammed, a postdoctoral researcher at Jefferson Lab and an investigator in the study. “We wanted to automate this process with a model that flashes a red light whenever something weird happens.”
“That way, system administrators can take action before conditions deteriorate even further.”
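One simple way to picture such a red light is a threshold on an anomaly score. The sketch below is illustrative only and is not the DIDACT implementation. It assumes each node already has a numeric anomaly score (for instance, a model's reconstruction error). The function names, the sample scores, and the three-standard-deviation rule are all assumptions.

```python
from statistics import mean, stdev


def red_light_threshold(baseline_scores, k=3.0):
    """Threshold = mean + k * standard deviation of scores from normal operation."""
    return mean(baseline_scores) + k * stdev(baseline_scores)


def flag_anomalies(scores, threshold):
    """Return the names whose score exceeds the threshold."""
    return [name for name, score in scores.items() if score > threshold]


# Hypothetical scores collected while the cluster behaved normally.
baseline = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13, 0.08]
threshold = red_light_threshold(baseline)

# Hypothetical scores for the latest monitoring window.
latest = {"node-01": 0.11, "node-02": 0.45, "node-03": 0.09}
print(flag_anomalies(latest, threshold))
```

A fixed statistical threshold is the simplest choice. A production system would likely tune it to balance missed anomalies against false alarms.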
DIDACT: AI for Data Center Management
To address these challenges, the team developed a machine learning-based management system called DIDACT (Digital Data Center Twin). The name plays on “didactic,” a word describing something designed to teach. Here, the learners are AI models studying computing systems.
Funded by Jefferson Lab’s Laboratory Directed Research & Development (LDRD) program, DIDACT aims to detect system anomalies and pinpoint their causes using an AI technique called continual learning.
In continual learning, ML models are trained on incoming data in an incremental fashion—similar to how humans and animals learn over time. The DIDACT team applies this method by training multiple models, each capturing different system behaviors, and selecting the best-performing one based on the latest data.
These models are built using unsupervised neural networks known as autoencoders. One of them integrates a graph neural network (GNN), which examines relationships between various system components.
“They compete using known data to determine which had lower error,” said Diana McSpadden, a Jefferson Lab data scientist and lead on the MLOps study. “Whichever won that day would be the ‘daily champion.’”
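The daily contest McSpadden describes can be sketched as a small loop: update every candidate on the newest data, score each on known data, and keep the one with the lowest error. This is a minimal illustration, not DIDACT's code. The `RunningMeanModel` class is a deliberately simple stand-in for the autoencoders, and all names, learning rates, and numbers are hypothetical.

```python
class RunningMeanModel:
    """Toy stand-in for a continually trained model: tracks a running estimate of one metric."""

    def __init__(self, name, learning_rate, initial_estimate=1.0):
        self.name = name
        self.learning_rate = learning_rate
        self.estimate = initial_estimate  # assumed result of earlier offline training

    def update(self, batch):
        """Incremental training step: nudge the estimate toward each new observation."""
        for value in batch:
            self.estimate += self.learning_rate * (value - self.estimate)

    def error(self, known_data):
        """Mean squared error between the estimate and known data."""
        return sum((v - self.estimate) ** 2 for v in known_data) / len(known_data)


def pick_daily_champion(models, new_batch, known_data):
    """Update every candidate on today's data, then return the one with the lowest error."""
    for model in models:
        model.update(new_batch)
    return min(models, key=lambda m: m.error(known_data))


candidates = [
    RunningMeanModel("steady", learning_rate=0.05),
    RunningMeanModel("adaptive", learning_rate=0.5),
]

# Hypothetical (new_batch, known_data) pairs; the workload shifts on day 2.
days = [
    ([1.0, 1.1, 0.9, 1.0], [1.0, 1.05]),
    ([3.0, 3.1, 2.9, 3.0], [3.0, 2.95]),
]

champions = []
for day, (new_batch, known_data) in enumerate(days, start=1):
    champion = pick_daily_champion(candidates, new_batch, known_data)
    champions.append(champion.name)
    print(f"Day {day} champion: {champion.name}")
```

Keeping several candidates that adapt at different speeds means a slow, stable model can win on quiet days while a faster one takes over when the workload shifts.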
This approach has the potential to significantly reduce downtime in data centers and optimize critical resources—leading to cost savings and improved efficiency in scientific research.
The Next Top AI Model
To train these models without disrupting daily computing operations, the DIDACT team developed a dedicated test environment known as the sandbox. Think of it as a proving ground where models are evaluated based on their learning capabilities.
The DIDACT system integrates open-source and custom-built software to develop and manage ML models, monitor the sandbox cluster, and log key data. All this information is displayed on an interactive dashboard for easy visualization.
The system operates through three main ML pipelines:
- Offline Development – A testing phase, akin to a dress rehearsal.
- Continual Learning – The real-time competition where models battle for top performance.
- Live Monitoring – The best-performing model becomes the main system monitor—until it’s dethroned by the next day’s champion.
“DIDACT represents a creative stitching together of hardware and open-source software,” said Hess, who is also the infrastructure architect for the High Performance Data Facility Hub being developed at Jefferson Lab in collaboration with DOE’s Lawrence Berkeley National Laboratory. “It’s a combination of things that you normally wouldn’t put together, and we’ve shown that it can work. It really draws on the strength of Jefferson Lab’s data science and computing operations expertise.”
Looking ahead, the DIDACT team plans to extend its research to explore how machine learning can optimize data center energy usage. This could involve reducing water consumption in cooling systems or adjusting processor activity based on real-time computing demands.
“The goal is always to provide more bang for the buck,” Hess said, “more science for the dollar.”
With AI-driven automation, the future of scientific computing is looking smarter, faster, and more efficient than ever.




