CSI Interfusion

Data Engineer Intern - Speech and Language Intelligence

  • Posted 23 hours ago

Job Description

This role is not just an internship.

It is an entry point into world-class AI collaboration.

Your Impact & Responsibilities

As a Data Engineer Intern, you will operate as a hands-on contributor to our ASR data pipeline, not a passive assistant.

You Will

  • Engineer, preprocess, and quality-validate large-scale speech and text datasets that directly influence ASR model performance
  • Design and execute data transformations including text normalization, data chunking, format conversion, and structured analysis
  • Optimize audio pipelines through segmentation, merging, transcoding, and subtitle/caption quality assurance
  • Strengthen data pipelines by improving robustness, traceability, and reproducibility through clean logs and documentation
  • Proactively identify data quality risks, triage issues at scale, and close the feedback loop with clarity and ownership

Your work feeds production speech models, not toy datasets.

Qualifications

We are looking for individuals who value engineering rigor, data quality, and long-term growth.

Required

  • Undergraduate or Master's student from a top-tier university (Top 10 preferred) in Computer Science, Electrical Engineering, Statistics, Data Science, or a related field
  • Strong Python fundamentals, with the ability to write, debug, and improve data-processing scripts
  • Strong ownership mindset with exceptional attention to data quality, standards, and reproducibility
  • Able to commit to 6 months or longer to ensure meaningful technical depth and impact

Nice To Have

  • Exposure to Speech, ASR, or NLP through coursework or hands-on projects
  • Experience with speech/audio processing, data collection workflows, or multimedia QA (e.g., captions/subtitles)
  • Chinese language proficiency is a strong plus, enabling smoother collaboration with cross-regional teams

Job ID: 148286473