Maincode

Data Scientist

Posted 5 Days Ago
In-Office
Melbourne, Victoria
Entry level
The Data Scientist will handle large-scale data to support AI model training, analyzing datasets and ensuring data quality and performance. Collaboration with engineers and researchers is key, along with designing data workflows, pipelines, and tools for dataset evaluation.
Overview

Maincode is building Australian-made AI models from the ground up. We train foundation models from scratch, design new reasoning architectures, and deploy them on state-of-the-art GPU clusters. Our data and infrastructure are entirely homegrown, from curation to large-scale training, to ensure independence, transparency, and excellence in model performance.

We’re looking for a Data Scientist to work at the intersection of large-scale data, machine learning, and AI systems. You’ll help source, analyse, and shape the datasets that train next-generation models, working closely with engineers and researchers to make data the backbone of Australia’s AI capability.

This role suits someone with strong data science fundamentals who’s comfortable working with large datasets and curious about how data powers AI training. Experience in model training or deep learning is a plus but not required; we’re happy to teach and support the right candidate.


What you’ll do
  • Explore, process, and analyse massive and diverse datasets, from text and structured data to code and multimodal content.

  • Design and implement scalable data workflows for cleaning, transforming, and validating high-volume datasets.

  • Build and maintain data pipelines that prepare training-ready data for large-scale AI models.

  • Develop tools and metrics for assessing dataset quality, diversity, and performance impact.

  • Collaborate with AI Researchers to align datasets with evolving model architectures and training objectives.

  • Support continuous improvement of data ingestion, curation, and evaluation systems.

  • Contribute to open discussions on data quality, ethics, and responsible dataset creation.


Who you are
  • Experienced in Python and familiar with data processing frameworks (e.g., Pandas, PySpark, Dask, or Ray).

  • Strong background in data analysis, feature engineering, and statistical modelling.

  • Comfortable working with large datasets (multi-terabyte scale or distributed systems).

  • Understanding of data quality, validation, and reproducibility principles.

  • Interested in or curious about machine learning, deep learning, or AI training pipelines.

  • Pragmatic, hands-on, and excited to learn new systems, tools, and techniques.

  • Motivated to help build Australian-made AI capability and world-class data infrastructure.


Why Maincode

Maincode is a small, highly technical team working at the frontier of Australian AI. We build foundation models from scratch rather than just fine-tuning existing ones, and the data you work on will directly shape the behaviour of cutting-edge systems.

You’ll be surrounded by people who:

  • Care deeply about data quality and scientific rigour.

  • Build systems that scale cleanly and transparently.

  • Enjoy experimenting, learning, and shipping fast.

  • Want to see Australia lead in independent AI innovation.

Top Skills

Dask
Pandas
PySpark
Python
Ray
HQ

Maincode Melbourne, Victoria, AUS Office

Melbourne, VIC, Australia, 3000


