
Census

Data and AI Engineer (Remote Worldwide)

Posted 7 Days Ago
Remote
Hiring Remotely in Greece
Senior level

About CENSUS 

CENSUS LABS is a cybersecurity engineering powerhouse specializing in building secure and resilient systems. Our work is research-driven and engineering-focused, enabling us to deliver bespoke development and custom solutions at the intersection of cybersecurity and emerging technologies. By addressing complex product challenges, we help our partners evolve their platforms across domains such as secure communications, IoT, AI-powered systems, and enterprise applications. 

Learn more at CENSUS-Labs.com

We are seeking a Data & AI Engineer to design, implement, and optimize data pipelines and backend integrations that support next-generation cybersecurity and data-intensive platforms. The role is primarily data-centric, with emphasis on scalable pipeline design, enrichment and annotation workflows, schema modeling, and performance-driven analytics on large datasets. 

As part of cross-functional teams, you will also contribute to the integration of AI-enabled features (such as natural language querying, contextual enrichment, and intelligent analytics) where they intersect with data engineering. Our bespoke projects operate across the spectrum of data engineering, applied AI, and cybersecurity, addressing complex challenges such as large-scale security processing and autonomous decision engines. As part of this role you will have the opportunity to shape architectures that are secure, scalable, and intelligent. 

Key Responsibilities 

Data Pipelines & Processing
  • Design and build high-performance data ingestion and processing pipelines using frameworks such as Spark and Iceberg
  • Implement workflows for data correlation, enrichment, and annotation
  • Ensure data quality, lineage, and reproducibility across ingestion and transformation stages
  • Support analytics and reporting through platforms such as Superset, Presto/Trino, and Spark SQL

Database & Storage Design
  • Design and maintain schemas for large-scale structured and semi-structured datasets
  • Optimize storage strategies for performance and cost (SQL/NoSQL, MPP, object storage, distributed file systems)
  • Apply indexing, partitioning, and tuning techniques for efficient big data analytics

AI/ML Integration
  • Integrate open-source models and embeddings into data workflows where applicable (e.g., RAG pipelines, FAISS)
  • Build services that enable natural language querying and contextual enrichment of datasets
  • Collaborate on fine-tuning pipelines to support security-driven use cases

Scalability & Reliability
  • Ensure pipeline and storage solutions scale to handle high-volume data
  • Implement monitoring, error handling, and resilience patterns in production
  • Work with DevOps teams to containerize and deploy data services efficiently

Collaboration & Delivery
  • Translate requirements from product/security teams into robust data engineering solutions
  • Contribute to PoCs, demos, and integration pilots
  • Document solutions and participate in knowledge transfer with internal and partner teams

Minimum Qualifications 

  • BSc/MSc in Computer Science, Data Engineering, or a related field (or equivalent practical experience)
  • 5+ years of experience in data engineering or big data analytics, with exposure to AI/ML integration
  • Strong proficiency in Python (Pandas, PySpark, FastAPI) and familiarity with Java/Scala for Spark 
  • Solid understanding of data pipeline design, schema modeling, and lifecycle management 
  • Experience with big data ecosystems: Apache Spark, Iceberg, Hive, Presto/Trino, Superset, or equivalents 
  • Hands-on experience with SQL (query optimization, tuning) and at least one NoSQL or distributed storage technology 
  • Practical experience building and deploying APIs and services in cloud or on-prem environments 
  • Strong problem-solving, debugging, and communication skills 
  • Excellent written and spoken English

Preferred / Nice-to-Have Skills 

  • Experience with retrieval-augmented generation (RAG) pipelines and LLM-based applications 
  • Knowledge of security concepts and experience with network and software cybersecurity domains 
  • Experience with GPU acceleration, model optimization (quantization, distillation), and performance tuning 
  • Familiarity with containerization (Docker, Kubernetes) and DevOps workflows 
  • Exposure to BI / analytics platforms and visualization integration 

This role offers the opportunity to work on cutting-edge data and AI initiatives in cybersecurity, contributing to solutions that address real-world technology challenges. You will be part of a multidisciplinary team that blends security research, data engineering, and AI innovation, shaping systems that are not only secure but also intelligent and future-ready.

Top Skills

Spark
Docker
FastAPI
Hive
Iceberg
Java
Kubernetes
NoSQL
Pandas
Presto
PySpark
Python
Scala
SQL
Superset
Trino
