Prerequisites:

  • NVIDIA GPU with 8GB or more VRAM
  • Python environment
  • Git
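
To confirm that the GPU is visible and meets the 8GB VRAM requirement before you start, one quick check (assuming the NVIDIA driver and nvidia-smi are available) is:

# Print the GPU name and total VRAM; the memory column should show at least 8192 MiB
nvidia-smi --query-gpu=name,memory.total --format=csv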

Development Setup

1. Clone the Repository

git clone https://github.com/kongfoo-ai/internTA
cd internTA

2. Install Dependencies

pip install -r requirements.txt
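
If you prefer to keep the project's dependencies isolated from your system Python, you can create a virtual environment first (an optional step, not required by the repository):

# Optional: create and activate an isolated environment, then install into it
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt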

Model Development

Data Generation

The first step in customizing InternTA is preparing the training data. We support two types of fine-tuning data:

  • Direct Q&A data
  • Guided Q&A data
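
To make the expected shape of this data concrete, here is a minimal, purely illustrative direct Q&A record in the nested conversation layout commonly used for instruction fine-tuning. The field names and file name below are assumptions for illustration only; the authoritative schema is whatever the repository's data generation scripts produce in data/personal_assistant.json.

# Hypothetical example only -- write a single direct Q&A record to a scratch file
cat > data/example_record.json <<'EOF'
[
  {
    "conversation": [
      {
        "input": "An example question from the course material",
        "output": "An example answer written in the teaching-assistant style"
      }
    ]
  }
]
EOF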

Model Fine-tuning

1. Verify Training Data

Check for the presence of training data:

ls -lh data/personal_assistant.json
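
Beyond checking that the file exists, you can also confirm that it parses as valid JSON using Python's built-in json.tool module:

# Exits with an error if the file is not valid JSON
python -m json.tool data/personal_assistant.json > /dev/null && echo "training data parses as valid JSON"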

2. Start Fine-tuning

Run the training script:

sh train.sh

This will use Xtuner to fine-tune the base InternLM2 model.
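
While train.sh is running, it can be useful to watch GPU memory usage in a second terminal, especially on cards close to the 8GB minimum:

# Refresh GPU utilization and memory usage every 5 seconds
watch -n 5 nvidia-smi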

3. Check Training Progress

Monitor the training directory:

ls -lh train

Look for weight directories named pth_$NUM_EPOCH; the epoch number is what you will pass to the merge script in the next step.

4. Merge Model Weights

Merge the fine-tuned weights for your chosen epoch:

# Replace $NUM_EPOCH with your target epoch number
sh merge.sh $NUM_EPOCH
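
For example, if the train directory contains a pth_3 checkpoint, the third-epoch weights would be merged with:

sh merge.sh 3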

Testing and Evaluation

Interactive Testing

Test your model changes using the chat interface:

sh chat.sh

Automated Evaluation

Run the evaluation suite to measure model performance:

pytest ./test/test_model_evaluation.py

This will:

  1. Generate responses for test cases
  2. Calculate ROUGE similarity scores
  3. Output results to test_results.csv
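
After the run finishes, you can take a quick look at the results file; the exact columns depend on the test suite, so treat this simply as a generic inspection step:

# Show the header row and the first few per-case results
head -n 5 test_results.csv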

Contributing

We welcome contributions to InternTA! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Submit a pull request
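
A typical command-line flow for steps 2 through 4 looks like the following (the branch name is only an example):

# Create a feature branch, commit your work, and push it to your fork
git checkout -b feature/my-improvement
git add -A
git commit -m "Describe your change"
git push origin feature/my-improvement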

For major changes, please open an issue first to discuss what you would like to change.

Support

If you need help during development:

  1. Check the GitHub Issues
  2. Review the API Documentation
  3. Contact the development team at dev@kongfoo.cn