AI NLP Engineer Jobs

Discover the latest remote and onsite AI NLP Engineer roles across top active AI companies. Updated hourly.

Check out 133 new AI NLP Engineer opportunities posted on The Homebase

Evaluation Scenario Writer - AI Agent Testing Specialist

New
Top rated
Mindrift
Part-time
Full-time
Posted

Design realistic and structured evaluation scenarios for LLM-based agents, creating test cases that simulate human-performed tasks and defining gold-standard behavior to compare agent actions against. Create structured test cases that simulate complex human workflows. Define gold-standard behavior and scoring logic to evaluate agent actions. Analyze agent logs, failure modes, and decision paths. Work with code repositories and test frameworks to validate scenarios. Iterate on prompts, instructions, and test cases to improve clarity and difficulty. Ensure scenarios are production-ready, easy to run, and reusable.

$40 / hour (USD)

Singapore
Maybe global
Remote

Evaluation Scenario Writer - AI Agent Testing Specialist

New
Top rated
Mindrift
Part-time
Full-time
Posted

Design realistic and structured evaluation scenarios for LLM-based agents, including creating structured test cases that simulate complex human workflows. Define gold-standard behavior and scoring logic to evaluate agent actions. Analyze agent logs, failure modes, and decision paths. Work with code repositories and test frameworks to validate scenarios. Iterate on prompts, instructions, and test cases to improve clarity and difficulty. Ensure scenarios are production-ready, easy to run, and reusable.

$40 / hour (USD)

Saudi Arabia
Maybe global
Remote

Evaluation Scenario Writer - AI Agent Testing Specialist

New
Top rated
Mindrift
Part-time
Full-time
Posted

Design realistic and structured evaluation scenarios for LLM-based agents by creating test cases that simulate human-performed tasks and defining gold-standard behavior to compare agent actions against. Create structured test cases that simulate complex human workflows. Define gold-standard behavior and scoring logic to evaluate agent actions. Analyze agent logs, failure modes, and decision paths. Work with code repositories and test frameworks to validate scenarios. Iterate on prompts, instructions, and test cases to improve clarity and difficulty. Ensure scenarios are production-ready, easy to run, and reusable.

$30 / hour (USD)

Poland
Maybe global
Remote

Evaluation Scenario Writer - AI Agent Testing Specialist

New
Top rated
Mindrift
Part-time
Full-time
Posted

Design realistic and structured evaluation scenarios for LLM-based agents by creating test cases that simulate human-performed tasks and defining gold-standard behavior for comparison against agent actions. Create structured test cases that simulate complex human workflows. Define gold-standard behavior and scoring logic to evaluate agent actions. Analyze agent logs, failure modes, and decision paths. Work with code repositories and test frameworks to validate scenarios. Iterate on prompts, instructions, and test cases to improve clarity and difficulty. Ensure scenarios are production-ready, easy to run, and reusable.

$30 / hour (USD)

Portugal
Maybe global
Remote

Evaluation Scenario Writer - AI Agent Testing Specialist

New
Top rated
Mindrift
Part-time
Full-time
Posted

Design realistic and structured evaluation scenarios for LLM-based agents by creating test cases that simulate human-performed tasks and defining gold-standard behavior to compare agent actions against. Create structured test cases that simulate complex human workflows. Define gold-standard behavior and scoring logic to evaluate agent actions. Analyze agent logs, failure modes, and decision paths. Work with code repositories and test frameworks to validate scenarios. Iterate on prompts, instructions, and test cases to improve clarity and difficulty. Ensure scenarios are production-ready, easy to run, and reusable.

$50 / hour (USD)

Hungary
Maybe global
Remote

Evaluation Scenario Writer - AI Agent Testing Specialist

New
Top rated
Mindrift
Part-time
Full-time
Posted

Design realistic and structured evaluation scenarios for LLM-based agents, creating test cases that simulate human-performed tasks and defining gold-standard behavior to compare agent actions against. Ensure each scenario is clearly defined, well-scored, easy to execute, and reusable. Create structured test cases simulating complex human workflows. Define gold-standard behavior and scoring logic to evaluate agent actions. Analyze agent logs, failure modes, and decision paths. Work with code repositories and test frameworks to validate scenarios. Iterate on prompts, instructions, and test cases to improve clarity and difficulty. Ensure scenarios are production-ready and easy to run.

$24 / hour (USD)

South Africa
Maybe global
Remote

Freelance AI Evaluation Scenario Writer

New
Top rated
Mindrift
Part-time
Full-time
Posted

Design realistic and structured evaluation scenarios for LLM-based agents by creating test cases that simulate human-performed tasks and defining gold-standard behavior to compare agent actions against. Ensure each scenario is clearly defined, well-scored, easy to execute, and reusable. Create structured test cases simulating complex human workflows. Define gold-standard behavior and scoring logic to evaluate agent actions. Analyze agent logs, failure modes, and decision paths. Work with code repositories and test frameworks to validate scenarios. Iterate on prompts, instructions, and test cases to improve clarity and difficulty. Ensure scenarios are production-ready and easy to run.

$50 / hour (USD)

Spain
Maybe global
Remote

Evaluation Scenario Writer - AI Agent Testing Specialist

New
Top rated
Mindrift
Part-time
Full-time
Posted

Design realistic and structured evaluation scenarios for LLM-based agents by creating test cases that simulate human-performed tasks and defining gold-standard behavior to compare agent actions against. Create structured test cases simulating complex human workflows. Define gold-standard behavior and scoring logic to evaluate agent actions. Analyze agent logs, failure modes, and decision paths. Work with code repositories and test frameworks to validate scenarios. Iterate on prompts, instructions, and test cases to improve clarity and difficulty. Ensure each scenario is production-ready, easy to run, and reusable.

$12 / hour (USD)

Hyderabad, India
Maybe global
Remote

Freelance Legal Consultant - AI Trainer

New
Top rated
Mindrift
Part-time
Full-time
Posted

Collaborate on projects that improve generative AI models' abilities to handle specialized legal queries and complex reasoning. Generate training prompts, define evaluation criteria, and correct AI model responses based on your domain-specific legal knowledge.

Undisclosed hourly rate (USD)

Maybe global
Remote Solely

Freelance Legal Consultant - AI Trainer

New
Top rated
Mindrift
Contractor
Full-time
Posted

The AI Legal Trainer generates prompts, defines scoring criteria, and corrects AI model responses in legal domains. They contribute domain expertise to help improve generative AI models’ capacity for specialized reasoning.

Undisclosed hourly rate (USD)

Maybe global
Remote Solely

Want to see more AI NLP Engineer jobs?

View all jobs

Access all 4,256 remote & onsite AI jobs.

Join our private AI community to unlock full job access, and connect with founders, hiring managers, and top AI professionals.
(Yes, it’s still free—your best contributions are the price of admission.)

Frequently Asked Questions

Have questions about roles, locations, or requirements for AI NLP Engineer jobs?

What does an AI NLP Engineer do?

AI NLP Engineers develop, implement, and optimize natural language processing models and algorithms. They perform data preprocessing for NLP tasks and design applications such as text classification, sentiment analysis, named entity recognition, and machine translation. These specialists train models using metrics like BLEU and F1 scores, integrating them into products via APIs. They work with tools like spaCy, NLTK, and Hugging Face while implementing techniques including BERT, GPT, and Word2Vec. A typical day involves cleaning text data, fine-tuning models, debugging pipelines, and collaborating with data scientists to improve language processing capabilities across products.

What skills are required for AI NLP Engineer jobs?

AI NLP Engineer roles require strong programming skills, particularly in Python, along with proficiency in NLP libraries like spaCy, NLTK, and Transformers. Candidates need expertise in machine learning frameworks such as TensorFlow, PyTorch, and scikit-learn. Essential skills include text preprocessing, feature extraction, model training, and evaluation using relevant metrics. Familiarity with advanced neural architectures like BERT, LSTMs, and transformer models is crucial. Data manipulation using pandas and NumPy helps when handling large datasets. Cloud deployment experience with AWS, Azure, or GCP is increasingly valuable, as is knowledge of containerization through Docker and Kubernetes for model deployment.

What qualifications are needed for AI NLP Engineer jobs?

Most AI NLP Engineer positions require a Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Mathematics, or Computational Linguistics. Employers typically seek candidates with proven experience in NLP development roles and demonstrated understanding of core NLP principles like text representation techniques, semantic extraction, and entity recognition. Practical experience implementing machine learning models for language tasks is essential. Beyond formal education, a strong portfolio showing NLP projects, contributions to open-source libraries, or research publications can significantly strengthen applications. Some roles may require domain expertise in specific industries where language processing is being applied, such as healthcare, finance, or legal.

What is the salary range for AI NLP Engineer jobs?

AI NLP Engineer salaries vary based on several key factors. Geographic location significantly impacts compensation, with tech hubs like San Francisco and New York offering higher wages. Experience level creates substantial differences: entry-level roles typically pay less than positions requiring 5+ years of specialized NLP experience. Education credentials (Bachelor's vs. Master's vs. PhD) affect starting offers. Industry sector matters too, with finance and healthcare often paying premium rates. Company size and funding stage influence both base salary and equity compensation. Specialized expertise in transformers, BERT, or GPT architectures generally commands higher compensation, as does experience with multilingual models or domain-specific applications.

How long does it take to get hired as an AI NLP Engineer?

The hiring process for AI NLP Engineer positions typically spans 4-8 weeks. Initial resume screening takes 1-2 weeks as recruiters evaluate technical qualifications and experience. Technical assessments often include coding exercises focused on NLP tasks or take-home assignments to implement specific algorithms. Multiple interview rounds follow, usually including both technical discussions about NLP concepts and practical coding sessions. Many companies require candidates to present previous NLP projects or explain how they would approach specific language processing challenges. Specialized positions at larger tech companies might include additional system design interviews or discussions with multiple teams, potentially extending the timeline further.

Are AI NLP Engineer jobs in demand?

AI NLP Engineer roles show strong demand across industries as companies implement language processing capabilities in their products and services. Organizations need specialists who can build text classification systems, develop chatbots, create sentiment analysis tools, and implement language translation features. The proliferation of language models like BERT and GPT has expanded applications, creating opportunities in healthcare (medical document processing), finance (market sentiment analysis), customer service (automated support), and media (content generation and moderation). This demand is evident in the frequency of NLP Engineer job listings on major platforms and the development of specialized recruitment templates for these positions.

What is the difference between AI NLP Engineer and AI Prompt Engineer?

AI NLP Engineers focus on building and optimizing the underlying language processing models and systems, implementing algorithms for tasks like named entity recognition, sentiment analysis, and machine translation. They work directly with model architecture, training pipelines, and evaluation metrics. In contrast, AI Prompt Engineers specialize in crafting effective inputs (prompts) for existing large language models to produce desired outputs. Prompt Engineers focus on understanding model behavior and limitations, creating systematic instructions, and refining prompts to improve results without modifying the underlying model architecture. NLP Engineers need deeper technical expertise in machine learning, while Prompt Engineers require stronger understanding of language nuance and context manipulation.