Prerequisites:
- NVIDIA GPU with at least 8 GB of VRAM
- Python environment
- Git
Development Setup
1. Clone the Repository
2. Install Dependencies
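A sketch of both steps; the repository URL and dependency file are assumptions, so substitute the actual values:

```bash
# Clone the repository (URL is an assumption; use the actual InternTA repo)
git clone https://github.com/kongfoo-ai/InternTA.git
cd InternTA

# Install dependencies (assumes a requirements.txt at the repository root)
pip install -r requirements.txt
```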
Model Development
Data Generation
The first step in customizing InternTA is preparing the training data. We support two types of fine-tuning data:
- Direct Q&A data
- Guided Q&A data
Data Preparation Process
- Compile a question bank including:
  - Post-class thought questions
  - Key terms from the appendix
  - Fundamental concept knowledge
- Search for corresponding answers in the textbook
- Organize answers into a response database:
  - Direct answers for key terms
  - Guided responses for thought questions
Generate Training Data
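The exact entry point is not reproduced here; the following is a minimal sketch, assuming a generation script that reads the question bank and response database and emits fine-tuning samples (script and file names are all hypothetical):

```bash
# Hypothetical invocation: generate_data.py, data/question_bank.csv, and
# data/response_db.csv are illustrative names, not confirmed by the repo.
python generate_data.py \
  --questions data/question_bank.csv \
  --responses data/response_db.csv \
  --output data/train.jsonl
```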
Model Fine-tuning
1. Verify Training Data
Check for the presence of training data:
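For example, assuming the generated file is data/train.jsonl (path hypothetical):

```bash
# Confirm the generated training file exists and is non-empty
ls -lh data/train.jsonl
wc -l data/train.jsonl   # number of training samples, one record per line
```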
2. Start Fine-tuning
Run the training script; this will fine-tune the base DeepSeek model.
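The pth_$NUM_EPOCH checkpoint naming in the next step suggests an XTuner-style setup; the sketch below assumes that, and both the wrapper script and config name are hypothetical:

```bash
# Hypothetical wrapper script shipped with the repo
bash train.sh
# Or, if the project calls XTuner directly (config name is illustrative):
xtuner train internta_deepseek_qlora.py --deepspeed deepspeed_zero2
```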
3. Check Training Progress
Monitor the training directory and look for weight directories named pth_$NUM_EPOCH.
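For example (the work_dirs location is an assumption):

```bash
# List saved checkpoint directories; one pth_$NUM_EPOCH directory per epoch
ls -d work_dirs/*/pth_*
```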
4. Merge Model Weights
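The original merge command is not shown here. Assuming an XTuner-style LoRA workflow, this step converts the trained .pth weights to a HuggingFace-format adapter and merges it into the base model; every path and model name below is illustrative:

```bash
# Convert the trained checkpoint to a HuggingFace-format adapter
xtuner convert pth_to_hf internta_deepseek_qlora.py \
    work_dirs/internta_deepseek_qlora/pth_3 ./hf_adapter

# Merge the adapter into the base DeepSeek model
xtuner convert merge deepseek-ai/deepseek-llm-7b-chat ./hf_adapter ./merged_model
```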
Testing and Evaluation
Interactive Testing
Test your model changes using the chat interface:
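A sketch, assuming the repository ships a chat script (name and flags hypothetical):

```bash
# Start an interactive session against the merged weights
python chat.py --model ./merged_model
```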
Automated Evaluation
Run the evaluation suite to measure model performance. It will:
- Generate responses for test cases
- Calculate ROUGE similarity scores
- Output results to test_results.csv
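Only the test_results.csv output name comes from this page; the script and test-set names below are assumptions:

```bash
# Generate answers for each test case, score them with ROUGE, and write
# the per-case results to test_results.csv
python eval.py --model ./merged_model --testset data/test_cases.jsonl
```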
Troubleshooting
GPU Memory Issues
If you encounter GPU memory errors:
- Reduce batch size in training configuration
- Use gradient checkpointing
- Ensure no other processes are using GPU memory
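For the last point, nvidia-smi shows which processes are holding VRAM:

```bash
# Show GPU utilization and the processes currently holding memory
nvidia-smi
# Stop a stale process that is still holding VRAM (replace <PID>)
kill <PID>
```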
Data Generation Issues
If data generation fails:
- Check input file formats
- Verify textbook content is properly formatted
- Ensure sufficient disk space
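Two quick checks for the last two points (paths illustrative):

```bash
# Inspect the first records of an input file to verify its format
head -n 3 data/question_bank.csv
# Confirm there is enough free disk space in the working directory
df -h .
```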
Contributing
We welcome contributions to InternTA! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
Support
If you need help during development:
- Check the GitHub Issues
- Review the API Documentation
- Contact the development team at dev@kongfoo.cn