# Migrating Data in Django: A Guide to Using RunPython in Migration Files

While working on your project, you will have to add new fields to your database. When you add a new field to the database, what values should the existing rows get for this field? You may want it to be null or you may want to populate it with correct data.

How can you populate existing fields? You can write a script and run this script from the shell after running the migrations. Seems nice, but if you have multiple environments such as staging, production, etc. then you have to run this script manually everywhere.

It would be awesome if somehow we could run such a script in our migration file itself. Django offers us a special operation called [RunPython](https://docs.djangoproject.com/en/3.2/ref/migration-operations/#runpython) which will help us achieve the same. This will enable us to run python scripts as a migration operation where we perform our data population or migrations.

You could also create an empty migration file using the `python manage.py makemigrations --empty yourappname` command and add a `RunPython` operation to run a script.

Let's jump into two examples that will provide much more clarity on this topic (Don't miss the second example).

### Example 1: Populating existing records with data

We have an `Employee` model with fields `name` and `hire_date`. There are 5 records in the table.

```plaintext
class Employee(models.Models):

    name = models.CharField(max_length=64)
    hire_date = models.DateField()
```

![existing employee records](https://cdn.hashnode.com/res/hashnode/image/upload/v1620739661054/efO7xYQvx.png align="left")

We want to add a third field `employee_id` to store the Employee ID of the employee. As per the company policies, employee ID should have the format `EMP<hiring_year>-<incrementing_number>`

```plaintext
class Employee(models.Model):

    name = models.CharField(max_length=64)
    hire_date = models.DateField()
    employee_id = models.CharField(max_length=64, null=True, blank=True)
```

So, after adding `employee_id` in the `Employee` model, create the migration file using the command `python manage.py makemigrations`.

The migration file will have only the `migrations.AddField` operation for adding the `employee_id`. The `migrations.RunPython` operation can be added after the `migrations.AddField` operation to add the employee IDs to existing employees.

```plaintext
# Generated by Django 3.1.7 on 2021-05-11 13:29

from django.db import migrations, models


def add_employee_id_to_existing_employees(apps, schema_editor):
    Employee = apps.get_model('shop', 'Employee')
    employees = Employee.objects.all().order_by('id')
    for i, employee in enumerate(employees):
        employee_id = f'EMP{employee.hire_date.year}-{i + 1}'
        employee.employee_id = employee_id
        employee.save()


class Migration(migrations.Migration):

    dependencies = [
        ('shop', '0005_employee'),
    ]

    operations = [
        migrations.AddField(
            model_name='employee',
            name='employee_id',
            field=models.CharField(blank=True, max_length=64, null=True),
        ),
        migrations.RunPython(
            add_employee_id_to_existing_employees,
            migrations.RunPython.noop
        )
    ]
```

After we run migrate this file using the `python manage.py migrate` command, the employee IDs for the existing employees will be created.

![Screenshot from 2021-05-11 19-07-13.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1620740253687/9L_96WomD.png align="left")

### Example 2: Comma-separated tags to m2m tags

We have a model `Book` for storing books with fields `name` and `tags`. The `tags` field is currently a `CharField` which stores tags related to the book in a comma-separated string. We have 5 books in the table.

```plaintext
class Book(models.Model):

    name = models.CharField(max_length=64)
    tags = models.CharField(max_length=256)
```

![Screenshot from 2021-05-11 19-16-38.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1620740810069/WtF6HQnsZ.png align="left")

The comma-separated tags approach looks bad, right?

So we create a `BookTag` model where all the tags will be and add `tags` as a `ManyToManyField` in our `Book` model.

```plaintext
class BookTag(models.Model):

    name = models.CharField(max_length=64)


class Book(models.Model):

    name = models.CharField(max_length=64)
    # tags = models.CharField(max_length=256)
    tags = models.ManyToManyField(BookTag)
```

When we create the migration file for this, Django created 3 operations:

* `migrations.CreateModel` for the `BookTag` model.
    
* `migrations.RemoveField` for the tags (CharField).
    
* `migrations.AddField` for adding `BookTag` as a `ManyToManyField`.
    

```plaintext
# Generated by Django 3.1.7 on 2021-05-11 13:48

from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ('shop', '0007_book'),
    ]

    operations = [
        migrations.CreateModel(
            name='BookTag',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('name', models.CharField(max_length=64)),
            ],
        ),
        migrations.RemoveField(
            model_name='book',
            name='tags',
        ),
        migrations.AddField(
            model_name='book',
            name='tags',
            field=models.ManyToManyField(to='shop.BookTag'),
        ),
    ]
```

If we migrate this file, all our existing comma-separated tags data will be lost since Django will run the `RemoveField` operation on our comma-separated tags column.

We need to move the data properly from the old `tags` column to the new `tags` column before removing the old one.

We could run the operations in a certain way as illustrated below.

![sdasdas.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1620749561685/7ibYddy_S.png align="left")

```plaintext
# Generated by Django 3.1.7 on 2021-05-11 13:48

from django.db import migrations, models

def create_tags(apps, schema_editor):
    Book = apps.get_model('shop', 'Book')
    BookTag = apps.get_model('shop', 'BookTag')

    for book in Book.objects.all():
        print(f'For book {book.name} ...')
        old_book_tags = book.old_tags.split(',')
        for old_book_tag in old_book_tags:
            try:
                new_tag = BookTag.objects.get(name=old_book_tag)
                print(f'\tTag {new_tag.name} already present ...')
            except BookTag.DoesNotExist:
                new_tag = BookTag.objects.create(name=old_book_tag)
                print(f'\tTag {new_tag.name} created ...')
            book.tags.add(new_tag)
            print(f'\tTag {new_tag.name} added to {book.name}')

class Migration(migrations.Migration):

    dependencies = [
        ('shop', '0007_book'),
    ]

    operations = [
        migrations.CreateModel(
            name='BookTag',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('name', models.CharField(max_length=64)),
            ],
        ),

        migrations.RenameField(
            model_name='book',
            old_name='tags',
            new_name='old_tags',
        ),


        migrations.AddField(
            model_name='book',
            name='tags',
            field=models.ManyToManyField(to='shop.BookTag'),
        ),

        migrations.RunPython(
            create_tags,
            migrations.RunPython.noop
        ),

        migrations.RemoveField(
            model_name='book',
            name='old_tags',
        )
    ]
```

I had added print statements in the scripts, so while migrating we can see the script got executed.

![Screenshot from 2021-05-11 19-34-55.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1620741909957/fCTSUQYIo.png align="left")

We can check the books and their tags which are in a ManyToManyField tags field.

![Screenshot from 2021-05-11 19-36-20.png](https://cdn.hashnode.com/res/hashnode/image/upload/v1620741988890/JhNhGLzSG.png align="left")

Tada! Easy, right? That's how you can migrate without worrying and keep your data clean.

Thanks for reading!
