# Course: Migrating content from WordPress to Sanity
https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity

Compose a powerful, reusable migration script to convert live data into Studio-ready structured content with references and assets. This module focuses on WordPress' REST API but could be adapted to any data source.

---

## Navigation

**Track:** [Replatforming from a legacy CMS to a Content Operation System](https://www.sanity.io/learn/track/replatforming-to-sanity) · [View as markdown](https://www.sanity.io/learn/track/replatforming-to-sanity.md)

## Contents

1. [Introduction](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/introduction-to-wp-migration) · [markdown](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/introduction-to-wp-migration.md)
2. [Find your WordPress API](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/first-steps) · [markdown](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/first-steps.md)
3. [Preparing a Studio and schema types](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/preparing-a-studio-and-schema-types) · [markdown](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/preparing-a-studio-and-schema-types.md)
4. [Preparing your migration script](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/preparing-your-cli-script) · [markdown](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/preparing-your-cli-script.md)
5. [Processing post types](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/support-importing-many-post-types) · [markdown](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/support-importing-many-post-types.md)
6. [Creating complete documents](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/creating-complete-documents) · [markdown](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/creating-complete-documents.md)
7. [Uploading assets performantly](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/uploading-assets-performantly) · [markdown](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/uploading-assets-performantly.md)
8. [Converting HTML to Portable Text](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/converting-html-to-portable-text) · [markdown](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/converting-html-to-portable-text.md)
9. [Converting WordPress blocks to Portable Text](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/converting-wordpress-blocks-to-portable-text) · [markdown](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/converting-wordpress-blocks-to-portable-text.md)
10. [Restructuring content](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/restructuring-content) · [markdown](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/restructuring-content.md)
11. [Conclusion](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/whats-next) · [markdown](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/whats-next.md)

---

## Lesson 1: Introduction
https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/introduction-to-wp-migration

Unlock the power of scripting content migrations into Sanity, fix past platform mistakes, and confidently handle unique content structures. Import users, posts, pages, categories, tags, and assets, and convert HTML markup into Portable Text.

When WordPress was launched in 2003, it was primarily a blogging platform. Over the years, however, it has been used as a CMS, powering over 40% of all websites (according to some statistics).



While WordPress has been a great solution for many (in fact, both authors of this course have shipped many WP sites throughout the years), it does have limitations:



- The content is tightly coupled with the front-end presentation layer, making it harder to reuse and adapt

- Extending and customizing can get complex and brittle, causing downtime

- The rich text editor stores content as HTML, making it hard to reuse

- Performance and scaling can be challenging, especially for larger sites and teams

- Security patches and updates can be a never-ending task that potentially breaks your site in interesting (and costly) ways


This course will take you through moving your content out of a realistic WordPress installation and into Sanity. With that being said, in our experience, there is a lot of variety in how WordPress sites have been set up and how the content is stored, like:



- Custom fields and post types added by plugins or themes

- Content stored in the WordPress database in unexpected ways

- Multilingual content using various plugins and approaches

- Media and files are stored in multiple places, not always as WordPress attachments

- Inconsistent use of Gutenberg blocks, the Classic editor, or page builder plugins


But worry not! In this course, you’ll learn how to add structure to your WordPress content and bring it into Sanity by building schema types, import scripts, and migration jobs. 



Many of these techniques will be directly transferable for re-platforming from other HTML-based CMSes like Drupal, Adobe Experience Manager, SiteCore, etc.



> [!TIP]
> The lessons in this course are WordPress-specific implementations of the principles outlined in [Refactoring content for migration](https://www.sanity.io/learn/course/refactoring-content).



Now, roll up your sleeves and get ready!



## Additional resources



If you're not quite ready to start migrating *today*, we have additional resources that compare Sanity and WordPress to better inform your migration journey.



- [Comparing WordPress to Sanity](https://www.sanity.io/sanity-vs-wordpress?ref=learn-wordpress), a look at the differences between Sanity, a Content Operating System and WordPress, a blogging platform.

- [Contact us to discuss a large-scale migration](https://www.sanity.io/contact/sales?ref=learn-wordpress): while this course covers the basics and applies to most WordPress installations, your use case may be of a size and complexity that needs more attention.


## Prerequisites



To complete this course, you will need the following:



- A free Sanity account to create new projects and initialize a new Sanity Studio. If you do not yet have an account, running any `npx sanity` command will prompt you to create one.

- Some familiarity with running commands from the terminal. Wes Bos' [Command Line Power User](https://commandlinepoweruser.com/) video course is free and can get you up to speed with the basics.

- [Node and npm installed](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm) (or [an npm-compatible JavaScript runtime](https://developer.mozilla.org/en-US/docs/Learn/Tools_and_testing/Understanding_client-side_tools/Package_management#what_exactly_is_a_package_manager)) to install and run the Sanity Studio development server locally.

- Some familiarity with JavaScript. The code examples in this course can all be copied and pasted and are written in TypeScript, but you will not need advanced knowledge of TypeScript to take advantage.

> [!TIP]
> See [Installation](https://www.sanity.io/docs/installation) for more options when starting new Sanity projects


- [ ] Get up and running with all the prerequisites


If you're stuck or have feedback on the lessons here on Sanity Learn, [join the Community Slack](https://slack.sanity.io/) or use the Feedback form at the bottom of every lesson.



## Can't this migration just be automated?



It is sometimes tempting to release one-size-fits-nobody `wordpress-to-sanity` import tooling. But the reason that you want to get out of WordPress is probably partly how it doesn't let you structure and reuse content. It's important to see a content migration as an opportunity to **fix** the mistakes of your current platform, not to recreate them in a new environment.



This course intends to demonstrate the power of scripting content migrations into Sanity and free you from the presentational thinking that dominates website-specific CMSes—ushering you into a world of structured content.







While WordPress' conventions make most of the source data easy to target, the shape of your content in Sanity may be unique to your organization's use case. 



Your WordPress installation likely also has plugins that augment the content in some way that we cannot account for, but you should be able to handle it once you complete this course.



### Getting your team onboard is also part of the migration



While this course focuses explicitly on the technicalities of moving your content over from WordPress, we must stress that bringing your team over is equally important. This goes for your engineering colleagues, your content team, and/or clients. Moving from a CMS to a Content Operating System like Sanity will also change how you work with content (for the better).



We recommend that your team take the [Re-platforming to Sanity](https://www.sanity.io/learn/course/re-platforming-to-sanity) course. It can also be run as a workshop! Approaching a content migration project cross-functionally will help you uncover those "unknown unknowns" more quickly, making it easier for everyone to get up to speed.



## Scope of this course



After this course, you will have imported users, posts, pages, categories, and tags into Sanity, created references between them, and uploaded assets. You will also have the skills to import content types like menus and comments.



You'll convert either post-processed or pre-processed HTML markup into Portable Text from the Classic or Block editor.



This course does not handle augmented content that may be modified by page builder (Elementor, Divi), SEO, or e-commerce plugins; however, anything accessible from the REST API can be added to your final import script.



---

## Lesson 2: Find your WordPress API
https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/first-steps

Find and test your WordPress installation's built-in REST API to retrieve content.

The first step is assessing how you export content from your WordPress project. There are several ways to extract content from WordPress. In this course, you'll target the REST API. 



## Accessing the WordPress REST API



The WordPress REST API is a core feature of WordPress and a predictable way to query public and private content directly from the source.



> [!TIP]
> See the [WordPress REST API handbook documentation](https://developer.wordpress.org/rest-api/).



If you prefer to use the [XML export](https://wordpress.com/support/export/) tooling or [WPGraphQL](https://www.wpgraphql.com/), you must modify the logic of the scripts written in this course.



Before proceeding, you must ensure that your current install allows access to the REST API. You may have a security plugin or other configuration that blocks access. A simple way to get around this is to run WordPress locally with a copy of your production database.



Typically, you can visit the REST API at the following route:



```
https://<your-domain-name>/wp-json/wp/v2
```

You should see a JSON response in your web browser, with the available routes to query content from:



![Image](https://cdn.sanity.io/images/3do82whm/next/f3639956069e8047ab26aeab9741664c66d4999f-2144x1388.png)

> [!TIP]
> The above is viewed in a Chromium-powered web browser with the [JSON Formatter extension](https://chromewebstore.google.com/detail/json-formatter/bcjindcccaagfpapjjmafapmmgkkhgoa?hl=en&pli=1) installed



To compare, some publicly available blogs which have an open REST API include:



```
https://ma.tt/wp-json/wp/v2
https://blog.ted.com/wp-json/wp/v2
https://finland.fi/wp-json/wp/v2
https://www.nasa.gov/wp-json/wp/v2
```

- [ ] Find your WordPress installation's REST API URL


Once you've confirmed your WordPress installation has an open API URL, you can get started.



### A note on Multisite URLs



For WordPress Multisite / Network setups, you will need to target a specific site by adding its name to the URL.



```
https://<your-domain-name>/<site-name>/wp-json/wp/v2
```

### Testing your REST API URL



During the following lessons, you'll write a script that queries the URL for each content type (posts, pages, etc) individually. From your terminal, you can see how many documents of each type can be found, by using `curl` and returning the headers.



Run the following in the terminal, with your REST API URL:



```sh:Terminal
curl -sI https://<your-domain-name>/wp-json/wp/v2/posts | grep -i '^x-wp'
```

You should receive a response like this, where you can see the total number of publicly available posts:



```sh:Terminal
x-wp-total: 21492
x-wp-totalpages: 2150
```

- `x-wp-total` is the number of publicly queryable posts

- `x-wp-totalpages` is the number of paginated responses you can query through to retrieve them, based on the `per_page` value in effect: the REST API's default of 10 unless you add a different `per_page` to your query. The script you write in this course will query for 100.
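
The arithmetic behind these two headers can be checked with a few lines of TypeScript. This is a minimal sketch, not part of the migration script: `readWpTotals` and `pagesNeeded` are hypothetical helper names, and `HeaderReader` is a stand-in for the `Headers` object on a `fetch` response. It assumes the REST API default of 10 items per page.

```typescript
// Minimal stand-in for the `Headers` object on a `fetch` response.
type HeaderReader = {get(name: string): string | null}

// Read the pagination headers WordPress adds to collection responses.
export function readWpTotals(headers: HeaderReader): {total: number; totalPages: number} {
  return {
    total: Number(headers.get('x-wp-total') ?? 0),
    totalPages: Number(headers.get('x-wp-totalpages') ?? 0),
  }
}

// Number of paginated requests needed to retrieve `total` items.
export function pagesNeeded(total: number, perPage: number): number {
  return Math.ceil(total / perPage)
}

// Using the sample response above:
console.log(pagesNeeded(21492, 10)) // 2150, matching x-wp-totalpages
console.log(pagesNeeded(21492, 100)) // 215 requests at 100 per page
```

With a real response, `readWpTotals(res.headers)` should work as-is, because header lookups on a `fetch` response are case-insensitive.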


Now that you've confirmed the URL works and can see how many publicly available posts can be queried from your WordPress installation, let's prepare a new home for your content.



---

## Lesson 3: Preparing a Studio and schema types
https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/preparing-a-studio-and-schema-types

Configure the Sanity Studio schema to view your imported WordPress content in real-time.

You *could* begin immediately importing data of any shape into the Content Lake before configuring your Sanity Studio with a schema. 



By preparing the schema types first, you'll gain the extremely satisfying feedback loop of seeing the Studio update in real time with your imported content. It also makes it possible to collaborate with a content team and get feedback on how the editorial experience works for them.



Notice that we are adapting the implied content model from WordPress fairly 1:1 in this part of the course. Later, we'll walk you through extracting and structuring WordPress content into a more refined content model.



## Initialize a Sanity project



Unless you're working on an existing project, start a new free Sanity project from the command line:



```sh
npm create sanity@latest -- --template clean --create-project "Sanity WordPress" --dataset production --typescript --output-path sanity-wordpress
```

- [ ] **Initialize** a new Sanity Studio project folder or navigate to an existing one that you want to work from


Once the installation is done, `cd` into your new Studio folder, and run this command to install the [Sanity Icons package](https://icons.sanity.build/all) (replace `npm` with your preferred package manager):



```sh
npm install @sanity/icons
```

- [ ] **Install** `@sanity/icons` as a dependency


## Add document schema types



You'll start by adding schema types to the Studio similar to the response shapes you'll get from WordPress' REST API. 



Note that some of these fields correspond to functionality in WordPress but will only be "data" in a Sanity context. For example, the **status** field will only be data on a document but can be adapted into Sanity's document and workflow model. The **sticky** field can be used and repurposed when you integrate this content in a front end through queries ("give me the newest document with the **sticky** property"). You can also disregard or delete these fields later if you don't need to keep them around.
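
As a sketch of that kind of query, the GROQ string below selects the most recent sticky post, using the field names from the post schema in this lesson. The constant name `newestStickyQuery` is hypothetical. The in-memory `newestSticky` function only mirrors the query's logic for illustration, and it assumes `date` holds ISO 8601 strings in a consistent format so that they sort as text.

```typescript
// GROQ: the newest `post` document whose `sticky` field is true.
export const newestStickyQuery = `*[_type == "post" && sticky == true] | order(date desc)[0]`

type PostLike = {_id: string; date: string; sticky?: boolean}

// The same selection expressed over a plain array, to show what the query does.
export function newestSticky(posts: PostLike[]): PostLike | undefined {
  return posts
    .filter((post) => post.sticky === true)
    .sort((a, b) => b.date.localeCompare(a.date))[0]
}
```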



**Create** the following files in your Sanity Studio:



- [ ] **Create** a schema type for Pages:


```typescript:./schemaTypes/pageType.ts
import {DocumentIcon} from '@sanity/icons'
import {defineField, defineType} from 'sanity'

export const pageType = defineType({
  name: 'page',
  title: 'Page',
  type: 'document',
  icon: DocumentIcon,
  fields: [
    defineField({name: 'title', type: 'string'}),
    defineField({name: 'slug', type: 'slug'}),
    defineField({name: 'date', type: 'datetime'}),
    defineField({name: 'modified', type: 'datetime'}),
    defineField({
      name: 'status',
      type: 'string',
      options: {
        list: [
          {title: 'Published', value: 'publish'},
          {title: 'Future', value: 'future'},
          {title: 'Draft', value: 'draft'},
          {title: 'Pending', value: 'pending'},
          {title: 'Private', value: 'private'},
          {title: 'Trash', value: 'trash'},
          {title: 'Auto-Draft', value: 'auto-draft'},
          {title: 'Inherit', value: 'inherit'},
        ],
      },
    }),
    defineField({
      name: 'content',
      type: 'portableText',
    }),
    defineField({
      name: 'excerpt',
      type: 'portableText',
    }),
    defineField({name: 'featuredMedia', type: 'image'}),
    defineField({
      name: 'author',
      type: 'reference',
      to: [{type: 'author'}],
    }),
  ],
  preview: {
    select: {
      title: 'title',
      subtitle: 'author.name',
      media: 'featuredMedia',
    },
  },
})
```

- [ ] **Create** a schema type for Posts


```typescript:./schemaTypes/postType.ts
import {ComposeIcon} from '@sanity/icons'
import {defineField, defineType} from 'sanity'

export const postType = defineType({
  name: 'post',
  title: 'Post',
  type: 'document',
  icon: ComposeIcon,
  fields: [
    defineField({name: 'title', type: 'string'}),
    defineField({name: 'slug', type: 'slug'}),
    defineField({name: 'date', type: 'datetime'}),
    defineField({name: 'modified', type: 'datetime'}),
    defineField({
      name: 'status',
      type: 'string',
      options: {
        list: [
          {title: 'Published', value: 'publish'},
          {title: 'Future', value: 'future'},
          {title: 'Draft', value: 'draft'},
          {title: 'Pending', value: 'pending'},
          {title: 'Private', value: 'private'},
          {title: 'Trash', value: 'trash'},
          {title: 'Auto-Draft', value: 'auto-draft'},
          {title: 'Inherit', value: 'inherit'},
        ],
      },
    }),
    defineField({
      name: 'content',
      type: 'portableText',
    }),
    defineField({
      name: 'excerpt',
      type: 'portableText',
    }),
    defineField({name: 'featuredMedia', type: 'image'}),
    defineField({name: 'sticky', type: 'boolean'}),
    defineField({
      name: 'author',
      type: 'reference',
      to: [{type: 'author'}],
    }),
    defineField({
      name: 'categories',
      type: 'array',
      of: [{type: 'reference', to: [{type: 'category'}]}],
    }),
    defineField({
      name: 'tags',
      type: 'array',
      of: [{type: 'reference', to: [{type: 'tag'}]}],
    }),
  ],
  preview: {
    select: {
      title: 'title',
      subtitle: 'author.name',
      media: 'featuredMedia',
    },
  },
})
```

- [ ] **Create** a schema type for tags


```typescript:./schemaTypes/tagType.ts
import {TagIcon} from '@sanity/icons'
import {defineField, defineType} from 'sanity'

export const tagType = defineType({
  name: 'tag',
  title: 'Tag',
  type: 'document',
  icon: TagIcon,
  fields: [defineField({name: 'name', type: 'string'}), defineField({name: 'slug', type: 'slug'})],
  preview: {
    select: {
      title: 'name',
      subtitle: 'slug.current',
    },
  },
})
```

- [ ] **Create** a schema type for categories


```typescript:./schemaTypes/categoryType.ts
import {FilterIcon} from '@sanity/icons'
import {defineField, defineType} from 'sanity'

export const categoryType = defineType({
  name: 'category',
  title: 'Category',
  type: 'document',
  icon: FilterIcon,
  fields: [
    defineField({name: 'name', type: 'string'}),
    defineField({name: 'slug', type: 'slug'}),
  ],
  preview: {
    select: {
      title: 'name',
      subtitle: 'slug.current',
    },
  },
})
```

- [ ] **Create** a schema type for authors


```typescript:./schemaTypes/authorType.ts
import {UserIcon} from '@sanity/icons'
import {defineField, defineType} from 'sanity'

export const authorType = defineType({
  name: 'author',
  title: 'Author',
  type: 'document',
  icon: UserIcon,
  fields: [
    defineField({name: 'name', type: 'string'}),
    defineField({name: 'slug', type: 'slug'}),
    defineField({name: 'url', title: 'URL', type: 'url'}),
    defineField({name: 'description', type: 'text'}),
    defineField({name: 'avatar', type: 'image'}),
  ],
  preview: {
    select: {
      title: 'name',
      subtitle: 'url',
      media: 'avatar',
    },
  },
})
```

### Users vs authors



WordPress stores a document's "author" in the database as a "user." While keeping track of *who did what* to all documents in the Content Lake, Sanity does not automatically add project member data to your documents. The advantage of this separation is that you can treat "authors" more as an editorial concern, extend the author model as you wish, and have authors that don't correspond to project members (for example, guest authors outside of your organization). This also allows you to migrate the single author field into an array of authors later.



So, you will create an "author" document for every WordPress "user" and reference that document in posts and pages.
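
A minimal sketch of that mapping, under stated assumptions: `wpUserToAuthor` and `authorReference` are hypothetical helper names, the `author-<id>` document ID convention is an assumption made for this example, and `WpUser` lists only a subset of the fields in a WordPress REST API user response.

```typescript
// Subset of the fields a WordPress REST API user response contains.
type WpUser = {id: number; name: string; slug: string; url: string; description: string}

// Hypothetical helper: stage an `author` document for one WordPress user.
export function wpUserToAuthor(user: WpUser) {
  return {
    _id: `author-${user.id}`, // assumed ID convention, deterministic per user
    _type: 'author' as const,
    name: user.name,
    slug: {_type: 'slug' as const, current: user.slug},
    // Only set optional fields when WordPress returned a non-empty value
    ...(user.url ? {url: user.url} : {}),
    ...(user.description ? {description: user.description} : {}),
  }
}

// Hypothetical helper: the reference a post or page would store in its `author` field.
export function authorReference(wpUserId: number) {
  return {_type: 'reference' as const, _ref: `author-${wpUserId}`}
}
```

Because the ID is derived from the WordPress user ID, a post that lists user `7` as its author can point at `author-7` without looking the document up first.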



> [!TIP]
> The [Sanity User Select input](https://www.sanity.io/plugins/sanity-plugin-user-select-input) is a useful plugin if you need to relate Sanity project members to a document.



## Add custom schema field types



The document schema types created above use custom schema types that must also be registered to the Studio configuration.



There are two: a field type for storing rich text and block content, including images, and an object type for storing an external image URL.



- [ ] **Create** a schema type for Portable Text


```typescript:./schemaTypes/portableTextType.ts
import {defineType} from 'sanity'

// Registered as a top-level schema type, so it is declared with defineType
export const portableTextType = defineType({
  name: 'portableText',
  type: 'array',
  of: [{type: 'block'}, {type: 'image'}, {type: 'externalImage'}],
})

```

Next is the custom object type for an external image, which is useful during the migration and upload process.



- [ ] **Create** a schema type for the `externalImage` block


```typescript:./schemaTypes/externalImageType.ts
import {defineType} from 'sanity'

export const externalImageType = defineType({
  name: 'externalImage',
  title: 'External Image',
  type: 'object',
  fields: [
    {
      name: 'url',
      title: 'URL',
      type: 'url',
    },
  ],
})
```
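
To make the relationship between these two types concrete, here is an illustrative value for a `portableText` field containing one text block and one `externalImage` object. The `_key` values, the text, and the image URL are made up for this example; items in a Sanity array of objects each need a unique `_key`.

```typescript
// Illustrative value for a `portableText` field: one text block and one external image.
// The `_key` values and the image URL are made up for this example.
export const exampleContent = [
  {
    _type: 'block',
    _key: 'a1b2c3',
    style: 'normal',
    markDefs: [],
    children: [{_type: 'span', _key: 'd4e5f6', text: 'Hello from WordPress', marks: []}],
  },
  {
    _type: 'externalImage',
    _key: 'g7h8i9',
    url: 'https://example.com/wp-content/uploads/photo.jpg',
  },
]
```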

## Add schema types to your Studio workspace configuration



- [ ] **Register** these types in the `schemaTypes` array in `./schemaTypes/index.ts`, which your `sanity.config.ts` file imports.


```typescript:./schemaTypes/index.ts
import {authorType} from './authorType'
import {categoryType} from './categoryType'
import {externalImageType} from './externalImageType'
import {pageType} from './pageType'
import {postType} from './postType'
import {tagType} from './tagType'
import {portableTextType} from './portableTextType'

export const schemaTypes = [
  authorType,
  categoryType,
  pageType,
  postType,
  tagType,
  externalImageType,
  portableTextType
]
```

Now, run your development server:



```sh
npm run dev
```

Open your Studio at [http://localhost:3333](http://localhost:3333/). Once logged in, you should see the Structure tool with your five document schema types.



![Sanity Studio showing structure with five document types](https://cdn.sanity.io/images/3do82whm/next/4b7250c33296dd69018ae0bfda1d3ee10e50e26b-2144x1388.png)

With the Studio primed to author and query content, let's prepare a basic migration script to import some content.



---

## Lesson 4: Preparing your migration script
https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/preparing-your-cli-script

Import WordPress content into Sanity in bulk using the CLI Migration tool, creating a script that queries the WordPress REST API and writes documents to Sanity.

There are several ways to write scripts to import content into Sanity in bulk, which are covered in the [Scripting content migrations](https://www.sanity.io/learn/course/refactoring-content/scripting-content-migrations) lesson. For this course, you'll use the CLI Migration tooling, as recommended.



> [!TIP]
> See the [Migrations CLI command reference](https://www.sanity.io/learn/cli-reference/cli-migrations) in the documentation.


> [!TIP]
> Take the [Handling schema changes confidently](https://www.sanity.io/learn/course/handling-schema-changes-confidently) course to learn more about the tooling.



In the root of your Studio project, create a new migration script called "Import WP":



```sh
npx sanity@latest migration create "Import WP"
```

- Skip defining document types in the following prompt

- Choose the "minimalistic migration" template because you'll replace this file anyway.

- [ ] **Bootstrap** a migration script with the instructions above


Once finished, you should have a file in your Studio project at the following path that looks something like this:



```typescript:./migrations/import-wp/index.ts
import {at, defineMigration, setIfMissing, unset} from 'sanity/migrate'

export default defineMigration({
  title: 'import-wp',

  migrate: {
    document(doc, context) {
      // ...and so on
```

Migration scripts are primarily used for handling schema changes for a dataset in the Sanity Content Lake, such as renaming fields and removing or changing specific values across documents.



You can use the same tooling to do whatever you like – including creating new documents in bulk. The migration script you're creating won't target *existing* documents in the dataset of an existing type but rather query the WordPress REST API, iterate over the results, and return new `createOrReplace` mutations, making the script *idempotent.*
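
To see why deterministic IDs combined with `createOrReplace` make a script idempotent, consider this toy model. It is a sketch of the idea only: `applyCreateOrReplace` is a hypothetical function over an in-memory `Map`, not the Content Lake or the migration tooling, and the document IDs are made up.

```typescript
type Doc = {_id: string; _type: string; [field: string]: unknown}

// Toy in-memory dataset keyed by `_id`. This models the idea only, not the Content Lake.
export function applyCreateOrReplace(dataset: Map<string, Doc>, docs: Doc[]): Map<string, Doc> {
  for (const doc of docs) {
    dataset.set(doc._id, doc) // the same `_id` overwrites instead of duplicating
  }
  return dataset
}

const staged: Doc[] = [
  {_id: 'post-1', _type: 'post', title: 'Hello'},
  {_id: 'post-2', _type: 'post', title: 'World'},
]

const dataset = new Map<string, Doc>()
applyCreateOrReplace(dataset, staged)
applyCreateOrReplace(dataset, staged) // running the import a second time
console.log(dataset.size) // 2
```

Running the import twice leaves the same two documents, which is what lets you re-run the script safely while you refine it.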



The major benefit of using the migration tooling is that it will automatically batch mutations into transactions to avoid hitting rate limits. It also supports testing with "dry runs" by default and provides visual feedback in the terminal when running the script.



## Helpers and dependencies



You'll use some development dependencies as you build out the migration script. To save time, install them all now.



- [ ] **Install** these development dependencies to your Studio project.


```
npm install -D wp-types html-entities p-limit @portabletext/block-tools @wordpress/block-serialization-default-parser jsdom
```

- [wp-types](https://www.npmjs.com/package/wp-types) is a collection of WordPress types for API responses

- [html-entities](https://www.npmjs.com/package/html-entities) contains a helper function for decoding entities in HTML strings

- [p-limit](https://www.npmjs.com/package/p-limit) can throttle the number of concurrent asynchronous operations in an array of promises – used when uploading assets

- [@portabletext/block-tools](https://www.npmjs.com/package/@portabletext/block-tools) converts HTML strings to Portable Text

- [jsdom](https://www.npmjs.com/package/jsdom) converts HTML strings into a DOM that can be traversed

- [@wordpress/block-serialization-default-parser](https://www.npmjs.com/package/@wordpress/block-serialization-default-parser) can convert pre-processed HTML stored in the WordPress block editor into an array of block objects

- [ ] **Create** a file to store all the Types used in your migration script.


```typescript:./migrations/import-wp/types.ts
import type {
  WP_REST_API_Categories,
  WP_REST_API_Pages,
  WP_REST_API_Posts,
  WP_REST_API_Tags,
  WP_REST_API_Users,
} from 'wp-types'

export type WordPressDataType = 'categories' | 'posts' | 'pages' | 'tags' | 'users'

export type WordPressDataTypeResponses = {
  categories: WP_REST_API_Categories
  posts: WP_REST_API_Posts
  pages: WP_REST_API_Pages
  tags: WP_REST_API_Tags
  users: WP_REST_API_Users
}

export type SanitySchemaType = 'category' | 'post' | 'page' | 'tag' | 'author'
```

- [ ] **Create** a file to store constants and update the `BASE_URL` variable to use **your** WordPress website's URL.


```typescript:./migrations/import-wp/constants.ts
// Replace this with your WordPress site's WP-JSON REST API URL
export const BASE_URL = `https://<your-domain>/wp-json/wp/v2`
export const PER_PAGE = 100
```

- [ ] **Create** a helper function for returning a page of results from WordPress REST API


```typescript:./migrations/import-wp/lib/wpDataTypeFetch.ts
import {BASE_URL, PER_PAGE} from '../constants'
import type {WordPressDataType, WordPressDataTypeResponses} from '../types'

export async function wpDataTypeFetch<T extends WordPressDataType>(
  type: T,
  page: number
): Promise<WordPressDataTypeResponses[T]> {
  const wpApiUrl = new URL(`${BASE_URL}/${type}`)
  wpApiUrl.searchParams.set('page', page.toString())
  wpApiUrl.searchParams.set('per_page', PER_PAGE.toString())

  return fetch(wpApiUrl).then((res) => (res.ok ? res.json() : null))
}
```
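
To see which URL the helper requests, the sketch below isolates the URL-building step as a pure function. `buildWpUrl` is a hypothetical name used only for illustration, not a file you need to create, and `example.com` is a placeholder domain.

```typescript
// Hypothetical pure version of the URL-building step, with the base URL passed in.
export function buildWpUrl(baseUrl: string, type: string, page: number, perPage: number): string {
  const url = new URL(`${baseUrl}/${type}`)
  url.searchParams.set('page', page.toString())
  url.searchParams.set('per_page', perPage.toString())
  return url.toString()
}

console.log(buildWpUrl('https://example.com/wp-json/wp/v2', 'posts', 2, 100))
// https://example.com/wp-json/wp/v2/posts?page=2&per_page=100
```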

With these files created, you should now have a directory structure at the root of your Studio like this:



```
migrations
└── import-wp
    ├── types.ts
    ├── index.ts
    ├── constants.ts
    └── lib
        └── wpDataTypeFetch.ts
```

## Update the migration script



With all required dependencies installed and some basic helpers, replace the migration script created by the CLI.



- [ ] **Update** your migration script entirely with the code below.


```typescript:./migrations/import-wp/index.ts
import type {SanityDocumentLike} from 'sanity'
import {createOrReplace, defineMigration} from 'sanity/migrate'

import {wpDataTypeFetch} from './lib/wpDataTypeFetch'

// This will import `post` documents into Sanity from the WordPress API
export default defineMigration({
  title: 'Import WP',

  async *migrate() {
    const wpType = 'posts'
    let page = 1
    let hasMore = true

    while (hasMore) {
      try {
        const wpData = await wpDataTypeFetch(wpType, page)

        if (Array.isArray(wpData) && wpData.length) {
          const docs: SanityDocumentLike[] = []

          for (const wpDoc of wpData) {
            const doc: SanityDocumentLike = {
              _id: `post-${wpDoc.id}`,
              _type: 'post',
              title: wpDoc.title?.rendered.trim(),
            }

            docs.push(doc)
          }

          yield docs.map((doc) => createOrReplace(doc))
          page++
        } else {
          hasMore = false
        }
      } catch (error) {
        console.error(`Error fetching data for page ${page}:`, error)
        // Stop the loop in case of an error
        hasMore = false
      }
    }
  },
})
```

This new version of the script queries the WordPress REST API URL set in your `constants.ts` file inside the `wpDataTypeFetch` function. 



It should return up to 100 posts, which are then iterated over in a `for...of` loop to stage a new Sanity document for each:



- The `_id` is generated from the WordPress post ID

- The `_type` is set to `post`, which means that these documents will appear under Post in your studio

- We also set the `title` here so it's easier to visualize how this script works. 


You might have HTML entities in your titles – you will deal with these later in this course.



The `while` loop will keep querying for another 100 posts each time by paginating the results until it finds no more. Yes, similar to the iconic `while ( have_posts() ) : the_post();` WordPress loop.



As these posts are being "staged," the migration tooling will batch them into transactions, which are committed once the transaction reaches a certain size.
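
The pagination pattern can be tried offline with a stubbed fetcher. In this sketch, `fakeFetch`, `paginate`, and `collectBatches` are hypothetical names, and the stub returns `null` after its last page in the same way `wpDataTypeFetch` returns `null` for a response that is not OK.

```typescript
type Item = {id: number}

// Stub standing in for `wpDataTypeFetch`: three "pages" of data, then null.
const fakePages: Item[][] = [[{id: 1}, {id: 2}], [{id: 3}, {id: 4}], [{id: 5}]]
async function fakeFetch(page: number): Promise<Item[] | null> {
  return fakePages[page - 1] ?? null
}

// Yield one batch of IDs per page until a page comes back empty or null.
export async function* paginate(fetchPage: (page: number) => Promise<Item[] | null>) {
  let page = 1
  while (true) {
    const data = await fetchPage(page)
    if (!Array.isArray(data) || data.length === 0) return
    yield data.map((item) => `post-${item.id}`)
    page++
  }
}

// Drain the generator, as the migration tooling would consume each yielded batch.
export async function collectBatches(): Promise<string[][]> {
  const batches: string[][] = []
  for await (const batch of paginate(fakeFetch)) batches.push(batch)
  return batches
}
```

Each `yield` hands one page's worth of work to the consumer before the next page is requested, so the full result set never has to be held in memory at once.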



- [ ] **Run** your migration script. By default, it will perform a "dry run" where nothing is written to the dataset.


```sh
npx sanity@latest migration run import-wp
```

You should get visual feedback in the terminal for the posts that would be created. Something like this:



```sh
Running migration "import-wp" in dry mode

Project id:  f3lbec6z
Dataset:     production

 createOrReplace   post  post-685064
{
  "_id": "post-685064",
  "_type": "post",
  "title": "Robotic Assembly and Outfitting for NASA Space Missions"
}
```

- [ ] **Run** the script again, with dry run disabled, to write documents to your dataset. 


You will need to confirm you wish to proceed.



```sh
npx sanity@latest migration run import-wp --no-dry-run
```

Once run, you should receive a summary of the finished mutations and transactions (numbers will vary).



```sh
  0 documents processed.
  30 mutations generated.
  1 transactions committed.
```

Fantastic! You now have a migration script that can query the WordPress REST API and write as many documents to Sanity as it finds. 



However, it only looks for posts.



Let's make this script much smarter with a few options.



---

## Lesson 5: Processing post types
https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/support-importing-many-post-types

Create a dynamic migration script for WordPress post and page types that accepts custom content transformation options to unlock reuse potential.

Currently, the migration script is hard-coded only to query WordPress for `posts` and recreate new `post` documents.



You'll want to be able to import many WordPress built-in types, as well as custom types. It's beneficial to be able to choose which type to import when executing the migration script like this:



```sh:Terminal
# what we'll set up in this lesson
npx sanity@latest migration run import-wp --type=tags
```

In the example above `--type=tags` is an argument that can be passed from the terminal and read during the script's execution. Let's update the migration script to accept a specific WordPress post type when run and select the appropriate Sanity Studio schema type to create new documents.



- [ ] **Update** your constants file to create a map of WordPress types to Sanity Studio schema types


```typescript:migrations/import-wp/constants.ts
import type {SanitySchemaType, WordPressDataType} from './types'

// Replace this with your WordPress site's WP-JSON REST API URL
export const BASE_URL = 'https://<your-domain>/wp-json/wp/v2'
export const PER_PAGE = 100
  
export const WP_TYPE_TO_SANITY_SCHEMA_TYPE: Record<WordPressDataType, SanitySchemaType> = {
  categories: 'category',
  posts: 'post',
  pages: 'page',
  tags: 'tag',
  users: 'author',
}
```

You'll extend this object in another lesson when importing custom post types. You'll also account for situations where something that lives in WordPress as a page may actually become structured content of a different type in Sanity.



- [ ] **Create** a helper function to read the current arguments and throw an error if they are missing or invalid:


```typescript:./migrations/import-wp/lib/getDataTypes.ts
import {WP_TYPE_TO_SANITY_SCHEMA_TYPE} from '../constants'
import type {SanitySchemaType, WordPressDataType} from '../types'

// Get WordPress type from CLI arguments, and the corresponding Sanity schema type
export function getDataTypes(args: string[]): {
  wpType: WordPressDataType
  sanityType: SanitySchemaType
} {
  let wpType = args
    .find((a) => a.startsWith('--type='))
    ?.split('=')
    .pop() as WordPressDataType
  let sanityType = WP_TYPE_TO_SANITY_SCHEMA_TYPE[wpType]

  if (!wpType || !sanityType) {
    throw new Error(
      `Invalid WordPress data type. Specify one with --type=<type>, one of: ${Object.keys(
        WP_TYPE_TO_SANITY_SCHEMA_TYPE,
      ).join(', ')}`,
    )
  }

  return {wpType, sanityType}
}
```

This script will accept the current arguments (from [process.argv](https://nodejs.org/docs/latest/api/process.html#processargv)) and return both the WordPress type being queried and the matching Sanity schema type to use for creating new documents.



So, for example, if you run:



```
npx sanity@latest migration run import-wp --type=tags
```

The helper function above will return:



```json
{"wpType": "tags", "sanityType": "tag"}
```

### Update your migration script



Now you can implement this into your main migration script.



- [ ] **Replace** your migration script with the code below, which uses this new helper function.


```typescript:migrations/import-wp/index.ts
import {decode} from 'html-entities'
import type {SanityDocumentLike} from 'sanity'
import {createOrReplace, defineMigration} from 'sanity/migrate'
import type {WP_REST_API_Post, WP_REST_API_Term, WP_REST_API_User} from 'wp-types'

import {getDataTypes} from './lib/getDataTypes'
import {wpDataTypeFetch} from './lib/wpDataTypeFetch'

// Allow the migration script to import a specific post type when run
export default defineMigration({
  title: 'Import WP JSON data',

  async *migrate() {
    const {wpType, sanityType} = getDataTypes(process.argv)
    let page = 1
    let hasMore = true

    while (hasMore) {
      try {
        let wpData = await wpDataTypeFetch(wpType, page)

        if (Array.isArray(wpData) && wpData.length) {
          const docs: SanityDocumentLike[] = []

          for (let wpDoc of wpData) {
            const doc: SanityDocumentLike = {
              _id: `${sanityType}-${wpDoc.id}`,
              _type: sanityType,
            }

            if (wpType === 'posts' || wpType === 'pages') {
              wpDoc = wpDoc as WP_REST_API_Post
              doc.title = decode(wpDoc.title.rendered).trim()
            } else if (wpType === 'categories' || wpType === 'tags') {
              wpDoc = wpDoc as WP_REST_API_Term
              doc.name = decode(wpDoc.name).trim()
            } else if (wpType === 'users') {
              wpDoc = wpDoc as WP_REST_API_User
              doc.name = decode(wpDoc.name).trim()
            }

            docs.push(doc)
          }

          yield docs.map((doc) => createOrReplace(doc))
          page++
        } else {
          hasMore = false
        }
      } catch (error) {
        console.error(`Error fetching data for page ${page}:`, error)
        // Stop the loop in case of an error
        hasMore = false
      }
    }
  },
})
```

The key changes are the call to `getDataTypes` and the branching on `wpType` inside the loop. Now, when running the migration script, you must supply a valid WordPress type, and the script will create documents differently depending on which type they are.



Note also the `decode` function, which will convert HTML entities in your WordPress REST API response. The string is also trimmed as it may begin or end with whitespace. 



You could use a runtime validation library [such as Zod](https://zod.dev/) to check the validity of every value added to a staged document before it is added to the transaction.
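
As a rough illustration of the idea, here is a hand-rolled check that stands in for a Zod schema. The `StagedDoc` type, the `validateStagedDoc` name, and the specific rules are assumptions for this sketch; they are not part of the course code or of Zod's API.

```typescript
type StagedDoc = {_id: string; _type: string; [key: string]: unknown}

// Hypothetical validator: returns a list of problems, empty when the document looks safe to stage
function validateStagedDoc(doc: StagedDoc): string[] {
  const problems: string[] = []

  // Assumption: IDs should only contain letters, numbers, dots, dashes, and underscores
  if (!/^[A-Za-z0-9._-]+$/.test(doc._id)) {
    problems.push(`Invalid _id: "${doc._id}"`)
  }

  if (!doc._type) {
    problems.push('Missing _type')
  }

  // Posts and pages should end up with a non-empty title after decoding and trimming
  if (doc._type === 'post' || doc._type === 'page') {
    if (typeof doc.title !== 'string' || doc.title.length === 0) {
      problems.push('Missing or empty title')
    }
  }

  return problems
}
```

In the migration loop, you could call a function like this before `docs.push(doc)` and log or skip any document that returns problems.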



- [ ] **Run** the script now with a `type` argument


```
npx sanity@latest migration run import-wp --no-dry-run --type=posts
```

You should now see the same post documents with titles in your Studio. You can re-run the script now for pages, categories, tags, and users and see the Studio update with them as transactions complete:



```
npx sanity@latest migration run import-wp --no-dry-run --type=pages
npx sanity@latest migration run import-wp --no-dry-run --type=categories
npx sanity@latest migration run import-wp --no-dry-run --type=tags
npx sanity@latest migration run import-wp --no-dry-run --type=users
```

Your script now handles multiple post types that are core to WordPress. In the next few lessons, we'll create more complete documents.



---

## Lesson 6: Creating complete documents
https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/creating-complete-documents

Transform WordPress posts into more complete Sanity Studio documents with categories, authors, dates and more, using dedicated functions for each post type.

Now that the migration script can create documents of several different types, it's time to create more complete documents. In this lesson, you'll focus on creating a dedicated function for processing `post` type documents. You can then repeat these steps for all other document types.



Instances where it makes sense to transform content into a different content type – like from a page to structured content – will be covered in the next lesson.



## Leverage Sanity TypeGen



So far, the migration script has been using `SanityDocumentLike` as a TypeScript type for staged documents. This definition is too *loose* to be useful. Since you have the Sanity Studio schema types created for the types of documents being created, Sanity TypeGen can create more helpful types for the documents you're creating.



- [ ] **Run** the following command in your Sanity Studio project folder to extract your schema definitions


```
npx sanity@latest schema extract
```

You should now have a `schema.json` file at the root of your Studio project.



- [ ] **Run** the following command to generate Types from your schema


```sh
npx sanity@latest typegen generate
```

You should now have a `sanity.types.ts` file at the root of your Studio project.



> [!TIP]
> For more about Sanity TypeGen see [Generating types](https://www.sanity.io/learn/course/day-one-with-sanity-studio/generating-types)



## Update the migration script



In the previous lesson, you updated your script to write a `title` to pages and posts, and a `name` to tags and categories. All of these document types will have many more attributes, so to simplify things, you will now create dedicated functions for each post type.



This lesson will focus on just preparing `post` type documents.



- [ ] **Create** a new helper function to transform a WordPress post into a Sanity Studio post:


```typescript:migrations/import-wp/lib/transformToPost.ts
import {decode} from 'html-entities'
import type {WP_REST_API_Post} from 'wp-types'

import type {Post} from '../../../sanity.types'

// Remove these keys because they'll be created by Content Lake
type StagedPost = Omit<Post, '_createdAt' | '_updatedAt' | '_rev'>

export async function transformToPost(wpDoc: WP_REST_API_Post): Promise<StagedPost> {
  const doc: StagedPost = {
    _id: `post-${wpDoc.id}`,
    _type: 'post',
  }

  doc.title = decode(wpDoc.title.rendered).trim()

  return doc
}
```

This helper function has the same utility as was written directly into the migration script before, so you'll need to update that script to use it.



- [ ] **Update** the migration file:


```typescript:migrations/import-wp/index.ts
import type {SanityDocumentLike} from 'sanity'
import {createOrReplace, defineMigration} from 'sanity/migrate'
import type {WP_REST_API_Post, WP_REST_API_Term, WP_REST_API_User} from 'wp-types'

import {getDataTypes} from './lib/getDataTypes'
import {transformToPost} from './lib/transformToPost'
import {wpDataTypeFetch} from './lib/wpDataTypeFetch'

export default defineMigration({
  title: 'Import WP JSON data',

  async *migrate() {
    const {wpType} = getDataTypes(process.argv)
    let page = 1
    let hasMore = true

    while (hasMore) {
      try {
        let wpData = await wpDataTypeFetch(wpType, page)

        if (Array.isArray(wpData) && wpData.length) {
          const docs: SanityDocumentLike[] = []

          for (let wpDoc of wpData) {
            if (wpType === 'posts') {
              wpDoc = wpDoc as WP_REST_API_Post
              const doc = await transformToPost(wpDoc)
              docs.push(doc)
            } else if (wpType === 'pages') {
              wpDoc = wpDoc as WP_REST_API_Post
              // add your *page* transformation function
            } else if (wpType === 'categories') {
              wpDoc = wpDoc as WP_REST_API_Term
              // add your *category* transformation function
            } else if (wpType === 'tags') {
              wpDoc = wpDoc as WP_REST_API_Term
              // add your *tag* transformation function
            } else if (wpType === 'users') {
              wpDoc = wpDoc as WP_REST_API_User
              // add your *author* transformation function
            }
          }

          yield docs.map((doc) => createOrReplace(doc))
          page++
        } else {
          hasMore = false
        }
      } catch (error) {
        console.error(`Error fetching data for page ${page}:`, error)
        // Stop the loop in case of an error
        hasMore = false
      }
    }
  },
})
```

The migration script still performs the same actions as before; however, it no longer creates pages, categories, tags, or authors. You must create your own "transform" functions for each type individually.



## Completing the transformation



With your script ready to uniquely handle each post type, you can add more attributes to each post type. The following is a step-by-step walkthrough of these attributes; you'll find a completed example at the bottom of this lesson.



### Shaping slugs



Sanity Studio's slug field type stores its value inside an object, so you must convert it appropriately.



- [ ] Add the slug transformation to your transform function


```typescript
if (wpDoc.slug) {
  doc.slug = { _type: 'slug', current: wpDoc.slug }
}
```

### Adding taxonomies as references



Categories and Tags are present in the WordPress REST API response as an array of numbers. 



```json
"categories": [2864, 502],
```

These match the IDs in the WordPress database. Since we have used deterministic IDs in imported documents, you can convert this array of numbers to an array of references. Here is an example for categories:



- [ ] Add the category reference transformation to your transform function

- [ ] Repeat this logic for tags.


```typescript
if (Array.isArray(wpDoc.categories) && wpDoc.categories.length) {
  doc.categories = wpDoc.categories.map((catId) => ({
    _type: 'reference',
    _ref: `category-${catId}`
  }))
}
```

These category (and author) documents need to exist in the dataset **before** you can write a document that references them. Ensure you've run imports for these post types already.



Note that you *can* send array items without a `_key` attribute and Content Lake can automatically generate one for you, but because TypeScript complains, one is included in the final code at the end of this lesson.
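
Since the same mapping is needed for both categories and tags, one option is a small shared helper. The sketch below is an assumption rather than the course's final code: the `toReferences` name is made up, and it derives each `_key` from the referenced ID instead of generating a random one, so re-running the import produces identical array items.

```typescript
type KeyedReference = {_key: string; _type: 'reference'; _ref: string}

// Hypothetical helper: turn WordPress term IDs into keyed Sanity references
function toReferences(ids: number[] | undefined, sanityType: string): KeyedReference[] {
  if (!Array.isArray(ids)) return []

  // De-duplicate so two array items never share the same _key
  return Array.from(new Set(ids)).map(
    (id): KeyedReference => ({
      _key: `${sanityType}-${id}`,
      _type: 'reference',
      _ref: `${sanityType}-${id}`,
    }),
  )
}
```

Usage would look like `doc.categories = toReferences(wpDoc.categories, 'category')`. Note that this version returns an empty array when there are no IDs, so add a length check if you would rather leave the field unset.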



### Adding an author reference



A post typically only has one user. You would create a single reference to an "author" document created during the import process. You can use the migration tooling to turn this into an array of authors later if you need to support multiple authors in your front end.



- [ ] Add author reference to your transform function


```typescript
if (wpDoc.author) {
  doc.author = {
    _type: 'reference',
    _ref: `author-${wpDoc.author}`
  }
}
```

### Date fields



As detailed in the [Setting created and modified dates](https://www.sanity.io/learn/course/refactoring-content/setting-created-and-modified-dates) lesson, while it is *possible* to set the `_createdAt` and `_updatedAt` attributes in a mutation, it is not *recommended* if these dates have editorial meaning in your content. 



Therefore, it's best to add them as individual `datetime` fields. So, they have been included in the Sanity Studio post schema as the fields `date` and `modified`.



- [ ] Add the date-field transformations to your transform function.


```typescript
if (wpDoc.date) {
  doc.date = wpDoc.date
}

if (wpDoc.modified) {
  doc.modified = wpDoc.modified
}
```
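
One detail worth checking in your own data: the WordPress REST API returns `date` in the site's timezone and `date_gmt` in UTC, and neither string carries a timezone offset. If you would rather store unambiguous UTC timestamps, a sketch like the following could be used instead of copying `date` directly. The `wpGmtToIso` helper name is hypothetical.

```typescript
// Hypothetical helper: convert a WordPress *_gmt string (no offset) into a UTC ISO 8601 string
function wpGmtToIso(gmt: string | undefined): string | undefined {
  if (!gmt) return undefined

  // Append "Z" only when the string has no offset, so it is parsed as UTC
  const hasOffset = /(Z|[+-]\d{2}:\d{2})$/.test(gmt)
  const parsed = new Date(hasOffset ? gmt : `${gmt}Z`)

  // Invalid input yields NaN; skip the field rather than throw
  return Number.isNaN(parsed.getTime()) ? undefined : parsed.toISOString()
}
```

For example, `wpGmtToIso('2024-01-15T10:30:00')` returns `'2024-01-15T10:30:00.000Z'`. In the transform function this could be used as `doc.date = wpGmtToIso(wpDoc.date_gmt)` and `doc.modified = wpGmtToIso(wpDoc.modified_gmt)`.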

### Status and sticky



These fields have explicit meaning in your WordPress installation but imply logic that must be recreated in your front end. With Sanity, you are more likely to use a document's draft or published status than a string value. But you are welcome to import it as part of this migration.



- [ ] Add (or omit) the status and sticky transformations to your transform function.


```typescript
if (wpDoc.status) {
  doc.status = wpDoc.status as StagedPost['status']
}

doc.sticky = wpDoc.sticky === true
```
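
If you would rather express WordPress status through Sanity's own draft model, one possible approach is to change the staged `_id`, since draft documents in Sanity carry a `drafts.` prefix on their `_id`. The `stagedPostId` helper, and the decision to treat every status other than `publish` as a draft, are assumptions for this sketch.

```typescript
// Hypothetical helper: published posts keep a plain ID, everything else is staged as a draft
function stagedPostId(wpId: number, status: string | undefined): string {
  const baseId = `post-${wpId}`
  return status === 'publish' ? baseId : `drafts.${baseId}`
}
```

Unauthenticated REST API requests generally only return published posts, so this is only likely to matter if your fetch is authenticated and includes other statuses.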

### Custom and meta fields



This example is not included in the final script; it is an example for you to implement if it relates to your content model.



Through plugins such as Advanced Custom Fields or Yoast SEO, your posts may carry additional data such as taxonomy references or string fields. Here is an example field that is not part of WordPress core but could be present in your data:



```json
"read_time": 22,
```

First, create a matching field name in your Sanity Studio schema:



```typescript
defineField({name: 'readTime', type: 'number'})
```

Remember to re-run `schema extract` and `typegen generate` after each schema change!



And add it to your transform function:



```typescript
if (wpDoc.read_time) {
  doc.readTime = wpDoc.read_time
}
```

- [ ] Add any desired custom field transformations to your migration script.
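
Custom fields are not part of the `WP_REST_API_Post` type, and a truthiness check like the one above skips a legitimate value of `0`. A slightly more defensive version, sketched here with a hypothetical `readNumberField` helper, reads the value as `unknown` and only accepts finite numbers.

```typescript
// Hypothetical helper: safely read a numeric custom field that the wp-types definitions don't describe
function readNumberField(source: object, key: string): number | undefined {
  const value = (source as Record<string, unknown>)[key]

  // Accept real numbers, and also numeric strings such as "22" in case a plugin emits them
  const parsed = typeof value === 'string' && value.trim() !== '' ? Number(value) : value

  return typeof parsed === 'number' && Number.isFinite(parsed) ? parsed : undefined
}
```

In the transform function, this might be used as `const readTime = readNumberField(wpDoc, 'read_time')`, followed by `if (readTime !== undefined) doc.readTime = readTime`.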


## Put it all together



Review the transformation function now with all of the extra fields described above. This will add every field except the featured media, content, and excerpt fields, which are covered in later lessons.



- [ ] **Review and update** your `transformToPost` file to stage the remaining attributes.


```typescript:migrations/import-wp/lib/transformToPost.ts
import {uuid} from '@sanity/uuid'
import {decode} from 'html-entities'
import type {WP_REST_API_Post} from 'wp-types'

import type {Post} from '../../../sanity.types'

// Remove these keys because they'll be created by Content Lake
type StagedPost = Omit<Post, '_createdAt' | '_updatedAt' | '_rev'>

export async function transformToPost(wpDoc: WP_REST_API_Post): Promise<StagedPost> {
  const doc: StagedPost = {
    _id: `post-${wpDoc.id}`,
    _type: 'post',
  }

  doc.title = decode(wpDoc.title.rendered).trim()

  if (wpDoc.slug) {
    doc.slug = {_type: 'slug', current: wpDoc.slug}
  }

  if (Array.isArray(wpDoc.categories) && wpDoc.categories.length) {
    doc.categories = wpDoc.categories.map((catId) => ({
      _key: uuid(),
      _type: 'reference',
      _ref: `category-${catId}`,
    }))
  }

  if (Array.isArray(wpDoc.tags) && wpDoc.tags.length) {
    doc.tags = wpDoc.tags.map((tagId) => ({
      _key: uuid(),
      _type: 'reference',
      _ref: `tag-${tagId}`,
    }))
  }  

  if (wpDoc.author) {
    doc.author = {
      _type: 'reference',
      _ref: `author-${wpDoc.author}`,
    }
  }

  if (wpDoc.date) {
    doc.date = wpDoc.date
  }

  if (wpDoc.modified) {
    doc.modified = wpDoc.modified
  }

  if (wpDoc.status) {
    doc.status = wpDoc.status as StagedPost['status']
  }

  doc.sticky = wpDoc.sticky === true

  return doc
}
```

- [ ] **Run** the migration script to create more complete post documents.


```sh
npx sanity@latest migration run import-wp --no-dry-run --type=posts
```

Now open your Sanity Studio, if it wasn't open already, and you should see your post documents have categories, authors, dates and more filled with their correct values. You're getting there!



![Sanity Studio showing a slug field and date fields in a document](https://cdn.sanity.io/images/3do82whm/next/1c0a24bd61cda2d394a635f379405798ff43f321-2144x1388.png)

> [!NOTE]
> Your Studio won't be displaying images yet like the screenshot above. That's next!



Up until this point, migration has only been concerned with text content. It's time to start uploading assets and unpacking the (manageable) complexity that can bring.



---

## Lesson 7: Uploading assets performantly
https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/uploading-assets-performantly

So far, the migration script has staged documents to be created sequentially. Now you need to introduce asynchronous functions to upload assets, which has the potential to slow the import process down.

Your post and page documents likely have a `featured_media` reference that you should upload and reference in your new Sanity documents. This lesson will focus on the updated migration script and offer several new helper functions.



## Avoiding slow, sequential loops



Uploading an asset to Sanity while creating documents in migrations is a two-step process:



1. Fetch the asset file from a URL and use Sanity Client to upload it, which returns an asset document with its `_id`.

2. Attach the returned asset document ID to the current document as a reference in an asset field.


Because this operation is asynchronous, creating a document must wait for the upload and its response before the asset reference can be created.



Currently, the migration script requests up to 100 posts (or pages, categories, or tags) and then loops over them with a `for of` loop. This loop type is *somewhat* practical in this application as you could use an asynchronous function, and the loop will await completion before proceeding.



However, this means each document would have to wait in sequence for an image to upload, making the migration script incredibly slow. That's no way to live!



## Concurrency and rate limits



This does not mean we should go to the other extreme—uploading all 100 images simultaneously—as you would likely encounter the issue of rate limits.



> [!TIP]
> See [Technical limits](https://www.sanity.io/learn/content-lake/technical-limits) for more information about API rate limits.



One benefit of migration tooling is the built-in avoidance of rate limits, as it automatically batches mutations into transactions. However, now that you're introducing custom document creation into the script's execution, you need to be a little more careful.



Earlier, you installed `p-limit` as a dependency. This package allows you to create an array of async functions, which, when placed in a `Promise.all()` call, will throttle the number of simultaneous function invocations.



You'll see the updated script has replaced the `for of` loop with a `map` of functions wrapped in the `limit` function from `p-limit`. The script changes from staging each document individually to creating an array of asynchronous staging functions.
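
To make the throttling behaviour concrete, here is a minimal sketch of a concurrency limiter. It is not `p-limit`'s implementation, only an illustration of the idea; `createLimit` and `peakConcurrency` are hypothetical names, and the fake upload is a short timer.

```typescript
type Task<T> = () => Promise<T>

// Illustrative stand-in for p-limit: run at most `concurrency` tasks at once, queue the rest
function createLimit(concurrency: number) {
  let active = 0
  const queue: Array<() => void> = []

  // Called when a task settles: free the slot and start the next queued task, if any
  const next = () => {
    active--
    const run = queue.shift()
    if (run) run()
  }

  return function limit<T>(task: Task<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const run = () => {
        active++
        Promise.resolve().then(task).then(resolve, reject).then(next)
      }

      if (active < concurrency) {
        run()
      } else {
        queue.push(run)
      }
    })
  }
}

// Run `count` fake uploads through the limiter and report the highest number in flight at once
async function peakConcurrency(count: number, concurrency: number): Promise<number> {
  const limit = createLimit(concurrency)
  let inFlight = 0
  let peak = 0

  const fakeUpload = async (id: number) => {
    inFlight++
    peak = Math.max(peak, inFlight)
    await new Promise((resolve) => setTimeout(resolve, 5))
    inFlight--
    return id
  }

  await Promise.all(Array.from({length: count}, (_, id) => limit(() => fakeUpload(id))))
  return peak
}
```

Running 20 fake uploads with a limit of 5 never has more than 5 in flight at the same time, which is the same guarantee the migration script relies on from `pLimit(5)`.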



## Uploading images efficiently



The migration tooling makes a limited Sanity Client version available inside a `context` variable. As this version does not allow uploading assets, the updated script creates a new, fully-featured instance of Sanity Client using the same `projectId`, `dataset`, and `token` config.



### Appending source metadata



When uploading assets to Sanity, you can also append metadata about the "source" from which it came. This metadata enables more efficient re-running of the script to avoid re-uploading the same images on every invocation.



The WordPress REST API has a route for retrieving information about an image if you have its ID. A function to query that endpoint and return just the metadata we need to store in Sanity will make this more convenient.



- [ ] **Create** a new helper function to query the WordPress REST API's `/media` route for an image by its `id` value.


```typescript:./migrations/import-wp/lib/wpImageFetch.ts
import type {UploadClientConfig} from '@sanity/client'
import {decode} from 'html-entities'

import {BASE_URL} from '../constants'

// Get WordPress' asset metadata about an image by its ID
export async function wpImageFetch(id: number): Promise<UploadClientConfig | null> {
  const wpApiUrl = new URL(`${BASE_URL}/media/${id}`).toString()
  const imageData = await fetch(wpApiUrl).then((res) => res.json())

  if (!imageData || !imageData.source_url) {
    return null
  }

  let metadata: UploadClientConfig = {
    filename: imageData.source_url.split('/').pop(),
    source: {
      id: imageData.id,
      name: 'WordPress',
      url: imageData.source_url,
    },
    // Not technically part of the Sanity imageAsset schema, but used by the popular Media Plugin
    // @ts-expect-error
    altText: imageData.alt_text,
  }

  if (imageData?.title?.rendered) {
    metadata.title = decode(imageData.title.rendered)
  }

  if (imageData?.image_meta?.caption) {
    metadata.description = imageData.image_meta.caption
  }

  if (imageData?.image_meta?.credit) {
    metadata.creditLine = imageData.image_meta.credit
  }

  return metadata
}
```
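
The `filename` above is taken from the last path segment of `source_url`. If your media URLs might carry query strings or percent-encoded characters, a slightly more careful extraction could look like this sketch. `filenameFromUrl` is a hypothetical helper, not part of the course code.

```typescript
// Hypothetical helper: derive a clean filename from an asset URL
function filenameFromUrl(sourceUrl: string): string | undefined {
  let pathname: string

  try {
    // URL parsing separates the query string and hash from the path
    pathname = new URL(sourceUrl).pathname
  } catch {
    // Not a valid absolute URL
    return undefined
  }

  const lastSegment = pathname.split('/').filter(Boolean).pop()

  if (!lastSegment) return undefined

  try {
    // Decode percent-encoded characters, e.g. "my%20photo.jpg" becomes "my photo.jpg"
    return decodeURIComponent(lastSegment)
  } catch {
    // Malformed encoding: fall back to the raw segment
    return lastSegment
  }
}
```

For a URL such as `https://example.com/wp-content/uploads/2024/01/photo.jpg?w=1024`, this returns `photo.jpg`, where a plain `split('/').pop()` would keep the `?w=1024` suffix.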

When you use this function to retrieve an image record from WordPress, you'll need to pass it along to the function that uploads the image to Sanity.



- [ ] **Create** a helper function to upload an image to Sanity – using its URL – along with optional metadata:


```typescript:./migrations/import-wp/lib/sanityUploadFromUrl.ts
import {Readable} from 'node:stream'

import type {SanityClient, SanityImageAssetDocument, UploadClientConfig} from '@sanity/client'

export async function sanityUploadFromUrl(
  url: string,
  client: SanityClient,
  metadata: UploadClientConfig,
): Promise<SanityImageAssetDocument | null> {
  const {body} = await fetch(url)
  if (!body) {
    throw new Error(`No body found for ${url}`)
  }
  let data: SanityImageAssetDocument | null = null
  try {
    data = await client.assets.upload(
      'image',
      Readable.fromWeb(body),
      metadata,
    )
  } catch (error) {
    console.error(`Failed to upload image from ${url}`)
    console.error(error)

    return null
  }

  return data
}
```

This function returns a Sanity image asset document, whose `_id` value you'll use to create a reference to this asset.



The image schema type in Sanity stores a reference in the `asset` attribute. Since you'll be uploading many images, getting their ID, and creating a reference, having a helper function for this simple task makes sense.



- [ ] **Create** a helper function to take the `_id` of an asset document and return the shape of an asset reference in a document:


```typescript:./migrations/import-wp/lib/sanityIdToImageReference.ts
import type {Post} from '../../../sanity.types'

export function sanityIdToImageReference(id: string): Post['featuredMedia'] {
  return {
    _type: 'image',
    asset: {_type: 'reference', _ref: id},
  }
}
```

Note that the return type of this function is set to the `featuredMedia` field of a post – but it should satisfy any image field.



Now you have functions to query WordPress for an image, upload it to Sanity, and create a reference in a document. It is advantageous to have one more function that will query for existing images from the same source at the beginning of the migration script – to avoid re-uploading images unnecessarily.



- [ ] **Create** a helper function to query for previously uploaded images from WordPress.


```typescript:./migrations/import-wp/lib/sanityFetchImages.ts
import type {SanityClient} from 'sanity'

const query = `*[
    _type == "sanity.imageAsset" 
    && defined(source.id)
    && source.name == "WordPress"
]{
    _id,
    "sourceId": source.id
}`

export async function sanityFetchImages(client: SanityClient) {
  const initialImages = await client.fetch<{_id: string; sourceId: number}[]>(query)
  const existingImages: Record<number, string> = {}

  for (let index = 0; index < initialImages.length; index++) {
    existingImages[initialImages[index].sourceId] = initialImages[index]._id
  }

  return existingImages
}
```

This query will return all images in the dataset that have been uploaded with the source attributes our helpers use, then convert the response into an object for a basic (but fast!) key-value in-memory cache.



## Putting it all together



Now, with a strategy to query for and upload images efficiently, update your migration script below to put these pieces into place.



- [ ] **Update** your migration script to be asynchronous and throttled:


```typescript:./migrations/import-wp/index.ts
import {createClient} from '@sanity/client'
import pLimit from 'p-limit'
import {createOrReplace, defineMigration} from 'sanity/migrate'
import type {WP_REST_API_Post, WP_REST_API_Term} from 'wp-types'

import {getDataTypes} from './lib/getDataTypes'
import {sanityFetchImages} from './lib/sanityFetchImages'
import {transformToPost} from './lib/transformToPost'
import {wpDataTypeFetch} from './lib/wpDataTypeFetch'

const limit = pLimit(5)

// Add image imports, parallelized and limited
export default defineMigration({
  title: 'Import WP JSON data',

  async *migrate(docs, context) {
    // Create a full client to handle image uploads
    const client = createClient(context.client.config())

    // Create an in-memory image cache to avoid re-uploading images
    const existingImages = await sanityFetchImages(client)

    const {wpType} = getDataTypes(process.argv)
    let page = 1
    let hasMore = true

    while (hasMore) {
      try {
        let wpData = await wpDataTypeFetch(wpType, page)

        if (Array.isArray(wpData) && wpData.length) {
          // Create an array of concurrency-limited promises to stage documents
          const docs = wpData.map((wpDoc) =>
            limit(async () => {
              if (wpType === 'posts') {
                wpDoc = wpDoc as WP_REST_API_Post
                const doc = await transformToPost(wpDoc, client, existingImages)
                return doc
              } else if (wpType === 'pages') {
                wpDoc = wpDoc as WP_REST_API_Post
              } else if (wpType === 'categories') {
                wpDoc = wpDoc as WP_REST_API_Term
              } else if (wpType === 'tags') {
                wpDoc = wpDoc as WP_REST_API_Term
              }

              hasMore = false
              throw new Error(`Unhandled WordPress type: ${wpType}`)
            }),
          )

          // Resolve all documents concurrently, throttled by p-limit
          const resolvedDocs = await Promise.all(docs)

          yield resolvedDocs.map((doc) => createOrReplace(doc))
          page++
        } else {
          hasMore = false
        }
      } catch (error) {
        console.error(`Error fetching data for page ${page}:`, error)
        // Stop the loop in case of an error
        hasMore = false
      }
    }
  },
})
```

There are some significant changes in the migration script above:



- Instead of staging documents one by one, each document is now wrapped in the `limit` function from `p-limit` and collected in an array, which is then resolved at most five at a time. This prevents issues with rate limits as images are uploaded during the migration.

- The in-memory cache of existing images is queried before any migration begins.

- These images and Sanity Client are passed into the post-transform function.
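
Note that `Promise.all` rejects as soon as any one transform fails, so a single bad document causes the whole page to be skipped by the surrounding `catch`. If you would prefer to keep the documents that did succeed, `Promise.allSettled` is one option. This is a sketch of an alternative, not what the course script does; `settleDocs` is a hypothetical name.

```typescript
// Hypothetical helper: wait for every task, keep the successes, and collect the failures
async function settleDocs<T>(tasks: Promise<T>[]): Promise<{docs: T[]; errors: unknown[]}> {
  const results = await Promise.allSettled(tasks)
  const docs: T[] = []
  const errors: unknown[] = []

  for (const result of results) {
    if (result.status === 'fulfilled') {
      docs.push(result.value)
    } else {
      errors.push(result.reason)
    }
  }

  return {docs, errors}
}
```

You could then yield `createOrReplace` mutations for `docs` and log `errors` separately. Whether a partial page is acceptable depends on your migration, so treat this as a design choice rather than a fix.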


### Update the transform function



With the migration script set up to handle asynchronous functions, the `transformToPost` script needs to be updated to perform them.



- [ ] **Update** the `transformToPost` function to add image uploads.


```typescript:migrations/import-wp/lib/transformToPost.ts
import {uuid} from '@sanity/uuid'
import {decode} from 'html-entities'
import type {SanityClient} from 'sanity'
import type {WP_REST_API_Post} from 'wp-types'

import type {Post} from '../../../sanity.types'
import {sanityIdToImageReference} from './sanityIdToImageReference'
import {sanityUploadFromUrl} from './sanityUploadFromUrl'
import {wpImageFetch} from './wpImageFetch'

// Remove these keys because they'll be created by Content Lake
type StagedPost = Omit<Post, '_createdAt' | '_updatedAt' | '_rev'>

export async function transformToPost(
  wpDoc: WP_REST_API_Post,
  client: SanityClient,
  existingImages: Record<string, string> = {},
): Promise<StagedPost> {
  const doc: StagedPost = {
    _id: `post-${wpDoc.id}`,
    _type: 'post',
  }

  // ...all other attributes!

  // Document has an image
  if (typeof wpDoc.featured_media === 'number' && wpDoc.featured_media > 0) {
    // Image exists already in dataset
    if (existingImages[wpDoc.featured_media]) {
      doc.featuredMedia = sanityIdToImageReference(existingImages[wpDoc.featured_media])
    } else {
      // Retrieve image details from WordPress
      const metadata = await wpImageFetch(wpDoc.featured_media)

      if (metadata?.source?.url) {
        // Upload to Sanity
        const asset = await sanityUploadFromUrl(metadata.source.url, client, metadata)

        if (asset) {
          doc.featuredMedia = sanityIdToImageReference(asset._id)
          existingImages[wpDoc.featured_media] = asset._id
        }
      }
    }
  }

  return doc
}
```

## Run the import with images



Once again, you can execute your import script the same way you did before. You'll notice the script taking a little longer to execute as images are uploaded. However, it should be faster on subsequent runs as re-uploads are avoided.



```sh
npx sanity@latest migration run import-wp --no-dry-run --type=posts
```

You should now see documents being created with a shape like this:



```json
{
  "_id": "post-631475",
  "_type": "post",
  "title": "From NASA’s First Astronaut Class to Artemis II: The Importance of Military Jet Pilot Experience",
  "featuredMedia": {
    "_type": "image",
    "asset": {
      "_type": "reference",
      "_ref": "image-1b007a770ea5a9902c39cf07e04cd5483ec05a7e-3405x2495-jpg"
    }
  }
}
```

As the script commits transactions, you should see new documents appear with images.



![Sanity Studio showing an image upload](https://cdn.sanity.io/images/3do82whm/next/0b2b2f98c1bc3c6ae30152cc9bdf6d561cb6bcb5-2144x1388.png)

So far, you've imported several types of documents and uploaded images. Now it's time to get into the meat of these documents: block content and rich text.



---

## Lesson 8: Converting HTML to Portable Text
https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/converting-html-to-portable-text

Migrate HTML content from WordPress to Portable Text in Sanity, gaining presentation-agnostic block content and rich querying and filtering capabilities for your structured content.

Your migration script now handles individual content fields as well as image uploads. This is essential groundwork, but we've danced around what will undoubtedly be the most complicated part—migrating from HTML into Portable Text.



This can be difficult because WordPress's HTML is stored as a string and could contain literally *anything*. It is unstructured content, and the mess you're trying to get out of. While Sanity provides tooling to smooth this process, there are bound to be rough edges depending on the quality of the content in your existing data source.



In this lesson, you'll import post-processed HTML from the `content.rendered` attribute in the WordPress REST API response. This string of HTML will have all functions from your WordPress installation executed, such as "shortcodes" translated into markup.



Once complete, you'll have content stored in [Portable Text](https://www.sanity.io/learn/docs/block-content), a standard allowing you to:



- Work in a fully customizable, real-time collaborative editor with custom blocks, marks, styles, comments, and so on 

- Render block content directly as props in front end frameworks (on web and mobile)

- Query documents based on specific block content and apply filters to even the most complex structured block content and rich text


## Using WordPress blocks?



In the next lesson, you'll send an authenticated request to get access to pre-processed HTML in `content.raw` – preferable for handling content written in the Block Editor (known as Gutenberg). 



If you proceed with this lesson and import the content as-is, it will still import your final HTML markup into Portable Text; however, you will lose much control over how to serialize those blocks, forcing you to do icky things like rendering content with `dangerouslySetInnerHTML`.



### Using a "page builder" plugin?



Unfortunately, if you're using a page builder such as Elementor, Divi, or Beaver Builder, you will have a bad time with the migration. While the Block tools package below can extract content from these, the HTML output can be fairly messy and tricky to navigate.



These plugins lack a method to extract serialized content like the built-in WordPress block editor. They are more challenging to translate into structured content and almost impossible for us to reason about in a lesson. The good thing about migrating to Sanity and a modern stack is that you won't have to deal with this content lock-in again 🤞.



## HTML to Portable Text with Block tools



Earlier, you installed `@portabletext/block-tools` into the project. This package contains a function, `htmlToBlocks()`, to convert an HTML string into Portable Text.



By default, it will extract some formatting, such as headings, lists, and paragraphs, into corresponding Portable Text blocks. However, if you need to take some of the existing HTML and turn it into custom objects—like taking an image, uploading it, and creating a reference—that will require some customization.



The helper function below wraps `htmlToBlocks` and contains logic to extract the URL of any `<img>` tag found inside a `<figure>` tag. If your content field does not have images inside figure tags, you must update the script to find them.



Because the deserialize method is synchronous, the image URL is first stored in a block type `externalImage`.



In the next section, we map over each block to find an external image and use the URL to attempt to search for the image in the WordPress database. Because the image is just a string in the HTML markup, this process is not guaranteed to work, which, while unfortunate, is an excellent demonstration of why structured content and referential integrity are so important!



## Migrating to Portable Text



In the script below, you'll find that `htmlToBlocks` takes an options argument with `rules` that describe how to `deserialize` the incoming HTML to structured content in Portable Text. A rule exposes the HTML using the [HTML Node API](https://developer.mozilla.org/en-US/docs/Web/API/Node), letting you write fairly fine-grained conditional checks against its structure. This is a low-level API, so be prepared for some troubleshooting.



The script below does the following things:



1. Accepts a string of HTML

2. Converts it into Portable Text

3. Stores the image URL in an `externalImage` block whenever it finds a figure tag

4. Searches the WP REST API for each image based on its filename, in a throttled array of async functions

5. Uses an existing image from the in-memory cache or uploads the image, if it was found

6. Eliminates empty blocks

7. Returns the Portable Text

- [ ] **Create** the new wrapper function to turn an HTML string into Portable Text:


```typescript:./migrations/import-wp/lib/htmlToBlockContent.ts
import {htmlToBlocks} from '@portabletext/block-tools'
import {Schema} from '@sanity/schema'
import {uuid} from '@sanity/uuid'
import {JSDOM} from 'jsdom'
import pLimit from 'p-limit'
import type {FieldDefinition, SanityClient} from 'sanity'

import type {Post} from '../../../sanity.types'
import {schemaTypes} from '../../../schemaTypes'
import {BASE_URL} from '../constants'
import {sanityIdToImageReference} from './sanityIdToImageReference'
import {sanityUploadFromUrl} from './sanityUploadFromUrl'
import {wpImageFetch} from './wpImageFetch'

const defaultSchema = Schema.compile({types: schemaTypes})
const blockContentSchema = defaultSchema
  .get('post')
  .fields.find((field: FieldDefinition) => field.name === 'content').type

// https://github.com/portabletext/editor/tree/main/packages/block-tools
export async function htmlToBlockContent(
  html: string,
  client: SanityClient,
  imageCache: Record<number, string>,
): Promise<Post['content']> {
  // Convert HTML to Sanity's Portable Text
  let blocks = htmlToBlocks(html, blockContentSchema, {
    parseHtml: (html) => new JSDOM(html).window.document,
    rules: [
      {
        deserialize(node, next, block) {
          const el = node as HTMLElement

          if (node.nodeName.toLowerCase() === 'figure') {
            const url = el.querySelector('img')?.getAttribute('src')

            if (!url) {
              return undefined
            }

            return block({
              // these attributes may be overwritten by the image upload below
              _type: 'externalImage',
              url,
            })
          }

          return undefined
        },
      },
    ],
  })

  // Note: Multiple documents may be running this same function concurrently
  const limit = pLimit(2)

  const blocksWithUploads = blocks.map((block) =>
    limit(async () => {
      if (block._type !== 'externalImage' || !('url' in block)) {
        return block
      }

      // The filename is usually stored as the "slug" in WordPress media documents
      // Filename may be appended with dimensions like "-1024x683", remove with regex
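      const dimensions = /-\d+x\d+$/
      const slug = (block.url as string)
        .split('/')
        .pop()
        ?.split('.')
        ?.shift()
        ?.replace(dimensions, '')
        .toLocaleLowerCase()

      // Without a slug there is nothing to search for in WordPress
      if (!slug) {
        return block
      }

      const imageId = await fetch(`${BASE_URL}/media?slug=${encodeURIComponent(slug)}`)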
        .then((res) => (res.ok ? res.json() : null))
        .then((data) => (Array.isArray(data) && data.length ? data[0].id : null))

      if (typeof imageId !== 'number' || !imageId) {
        return block
      }

      if (imageCache[imageId]) {
        return {
          _key: block._key,
          ...sanityIdToImageReference(imageCache[imageId]),
        } as Extract<Post['content'], {_type: 'image'}>
      }

      const imageMetadata = await wpImageFetch(imageId)
      if (imageMetadata?.source?.url) {
        const imageDocument = await sanityUploadFromUrl(
          imageMetadata.source.url,
          client,
          imageMetadata,
        )
        if (imageDocument) {
          // Add to in-memory cache if re-used in other documents
          imageCache[imageId] = imageDocument._id

          return {
            _key: block._key,
            ...sanityIdToImageReference(imageCache[imageId]),
          } as Extract<Post['content'], {_type: 'image'}>
        } else {
          return block
        }
      }

      return block
    }),
  )

  blocks = await Promise.all(blocksWithUploads)

  // Eliminate empty blocks
  blocks = blocks.filter((block) => {
    if (!block) {
      return false
    } else if (!('children' in block)) {
      return true
    }

    return block.children.map((c) => (c.text as string).trim()).join('').length > 0
  })

  blocks = blocks.map((block) => (block._key ? block : {...block, _key: uuid()}))

  // TS complains there's no _key in these blocks, but this is corrected in the map above
  // @ts-expect-error
  return blocks
}
```
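
To make the filename lookup easier to reason about, here is a minimal sketch of the slug extraction on its own. It mirrors the string handling in the script above; the `slugFromImageUrl` name and the example URLs are hypothetical and exist only for illustration.

```typescript
// Derive the likely WordPress media "slug" from an image URL.
// The function name is hypothetical; the steps mirror the migration script.
function slugFromImageUrl(url: string): string | undefined {
  // WordPress appends dimensions such as "-1024x683" to resized files
  const dimensions = /-\d+x\d+$/

  const slug = url
    .split('/')
    .pop() // "My-Photo-1024x683.jpg"
    ?.split('.')
    .shift() // "My-Photo-1024x683"
    ?.replace(dimensions, '') // "My-Photo"
    .toLowerCase() // "my-photo"

  // Treat an empty string (for example, a URL ending in "/") as "no slug"
  return slug || undefined
}

console.log(slugFromImageUrl('https://example.com/wp-content/uploads/2024/01/My-Photo-1024x683.jpg'))
// my-photo
console.log(slugFromImageUrl('https://example.com/wp-content/uploads/'))
// undefined
```

Returning `undefined` for an empty result makes it easy to skip the WordPress media lookup when there is nothing to search for.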

- [ ] **Update** your `transformToPost.ts` script to convert HTML to Portable Text and write to the `content` field


```typescript:./migrations/import-wp/lib/transformToPost.ts
import {uuid} from '@sanity/uuid'
import {decode} from 'html-entities'
import type {SanityClient} from 'sanity'
import type {WP_REST_API_Post} from 'wp-types'

import type {Post} from '../../../sanity.types'
import {htmlToBlockContent} from './htmlToBlockContent'
import {sanityIdToImageReference} from './sanityIdToImageReference'
import {sanityUploadFromUrl} from './sanityUploadFromUrl'
import {wpImageFetch} from './wpImageFetch'

// Remove these keys because they'll be created by Content Lake
type StagedPost = Omit<Post, '_createdAt' | '_updatedAt' | '_rev'>

export async function transformToPost(
  wpDoc: WP_REST_API_Post,
  client: SanityClient,
  existingImages: Record<string, string> = {},
): Promise<StagedPost> {
  const doc: StagedPost = {
    _id: `post-${wpDoc.id}`,
    _type: 'post',
  }

  // ...all your other attributes

  if (wpDoc.content) {
    doc.content = await htmlToBlockContent(wpDoc.content.rendered, client, existingImages)
  }

  return doc
}  
```

You can now run the migrations again.



```sh
npx sanity@latest migration run import-wp --no-dry-run --type=posts
```

Once the transactions are committed, you should see documents appear with populated content fields.



![Portable Text field showing rich text](https://cdn.sanity.io/images/3do82whm/next/4f60dbdd2c9275f99315a51ed7bdd2a683c8c6e3-2144x1388.png)

You might notice that content once rendered in columns or other specific layouts is now rendered as a single column of block content.



If preserving presentation is essential, more work is required. We'll cover this in the next lesson by working with raw content from WordPress.



---

## Lesson 9: Converting WordPress blocks to Portable Text
https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/converting-wordpress-blocks-to-portable-text

Convert raw WordPress content into Portable Text, create custom schema types in Sanity Studio, and make authenticated requests to WordPress.

There might be situations where you want to preserve some of the presentational data from WordPress in your content. For some types of content, typically marketing landing pages with unique content that doesn't need to be reused, it's more pragmatic to migrate one-to-one.



The wrapper function from the [Converting HTML to Portable Text](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/converting-html-to-portable-text) lesson will not be wasted. What you need is the raw, *unprocessed* HTML, which contains markup from the WordPress block editor to create objects in Portable Text.



Consider a "columns" block, for example. This is a core block in WordPress. The previous lesson would have extracted text and images from its HTML and transformed them into block content without any column positioning detail (it's better to have this logic in your front end code). Trying to preserve that using class names alone from the post-processed HTML would be too difficult.



## Adding a custom column block type



Our Portable Text configuration in the Sanity Studio has no native concept of columns. You'll need to fix this first.



Register two new schema types to the Sanity Studio:



- [ ] **Create** a schema type for an array of columns


```typescript:./schemaTypes/columnsType.ts
import {defineField, defineType} from 'sanity'

export const columnsType = defineType({
  name: 'columns',
  type: 'object',
  fields: [
    defineField({
      name: 'columns',
      type: 'array',
      of: [{type: 'column'}],
    }),
  ],
})
```

- [ ] **Create** a schema type for an individual column:


```typescript:./schemaTypes/columnType.ts
import {defineField, defineType} from 'sanity'

export const columnType = defineType({
  name: 'column',
  type: 'object',
  fields: [
    defineField({
      name: 'content',
      type: 'portableText',
    }),
  ],
})
```

- [ ] **Update** your Portable Text schema type to include columns:


```typescript:./schemaTypes/portableTextType.ts
import {defineField} from 'sanity'

export const portableTextType = defineField({
  name: 'portableText',
  type: 'array',
  of: [{type: 'block'}, {type: 'image'}, {type: 'externalImage'}, {type: 'columns'}],
})
```

- [ ] **Update** your workspace schema types to include columns:


```typescript:./schemaTypes/index.ts
import {authorType} from './authorType'
import {categoryType} from './categoryType'
import {columnsType} from './columnsType'
import {columnType} from './columnType'
import {externalImageType} from './externalImageType'
import {pageType} from './pageType'
import {portableTextType} from './portableTextType'
import {postType} from './postType'
import {tagType} from './tagType'

export const schemaTypes = [
  authorType,
  categoryType,
  columnsType,
  columnType,
  externalImageType,
  pageType,
  portableTextType,
  postType,
  tagType,
]
```

Your posts and pages' Portable Text fields should now have the option to add "columns."



![Sanity Studio with Portable Text editor showing a "columns" option](https://cdn.sanity.io/images/3do82whm/next/3dd76d0cc4571660e095645f4701195a9bf5e712-2144x1388.png)

## Making authenticated requests to WordPress



If you examine the output of your WordPress REST API you won't find a `content.raw` in the response. This is because it is only available when a request contains the "context" of `edit`.



You can try adding this parameter to your request by appending `?context=edit` to the URL, but you'll receive a 401 in response because that context is not publicly available.



![WordPress REST API response showing a 401 error](https://cdn.sanity.io/images/3do82whm/next/6cd192c7250fc4fd343618b4a6a1aebe32fe0a0c-2144x1388.png)

To resolve this, you'll need to add "basic authentication" to the request, which can be done with an "application password."



> [!TIP]
> Learn more about [WordPress application passwords](https://make.wordpress.org/core/2020/11/05/application-passwords-integration-guide/)



Log in to your WordPress dashboard and go to wp-admin -> Users -> Edit User. Find your user account and scroll to the bottom of the page.



![WordPress dashboard showing application passwords](https://cdn.sanity.io/images/3do82whm/next/6903cdfe89fce9ff70554342d01667a36c9aa0df-2144x1388.png)

- [ ] **Create** a new application password in WordPress with any name, but be sure to copy the password.


### Update your WordPress fetch function



The `wpDataTypeFetch` function created in the [Find your WordPress API](https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/first-steps) lesson can now be updated to make an authenticated request.



- [ ] **Update** your WordPress fetch function to add authentication and a context search parameter – with your WordPress username and application password:


```typescript:./migrations/import-wp/lib/wpDataTypeFetch.ts
import {BASE_URL, PER_PAGE} from '../constants'
import type {WordPressDataType, WordPressDataTypeResponses} from '../types'

// Basic auth setup in wp-admin -> Users -> Edit User
// This is the WordPress USER name, not the password name
const username = 'replace-with-your-username'
const password = 'replace-with-your-password'

export async function wpDataTypeFetch<T extends WordPressDataType>(
  type: T,
  page: number,
  edit: boolean = false,
): Promise<WordPressDataTypeResponses[T]> {
  const wpApiUrl = new URL(`${BASE_URL}/${type}`)
  wpApiUrl.searchParams.set('page', page.toString())
  wpApiUrl.searchParams.set('per_page', PER_PAGE.toString())

  const headers = new Headers()

  if (edit) {
    // 'edit' context returns pre-processed content and other non-public fields
    wpApiUrl.searchParams.set('context', 'edit')

    headers.set(
      'Authorization',
      'Basic ' + Buffer.from(username + ':' + password).toString('base64'),
    )
  }

  return fetch(wpApiUrl, {headers}).then((res) => (res.ok ? res.json() : null))
}
```

**Important**: the script above stores a password in plain text. If you plan to commit this script to version control – or host it somewhere – consider storing and retrieving it as [environment variables](https://www.sanity.io/learn/docs/studio/environment-variables) from a `.env` file.
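
One possible way to do that is sketched below. The variable names `WP_USERNAME` and `WP_APP_PASSWORD` and the `basicAuthHeader` helper are assumptions for this example, and how the `.env` file gets loaded depends on your setup. The sketch also uses the global `btoa` instead of `Buffer` so it carries no Node-specific types; application passwords are plain ASCII, so both produce the same result.

```typescript
// A plain object standing in for process.env, so the sketch has no Node-specific types
type Env = Record<string, string | undefined>

// Hypothetical variable names; use whatever your .env file defines
function basicAuthHeader(env: Env): string {
  const username = env.WP_USERNAME
  const password = env.WP_APP_PASSWORD

  // Fail early rather than sending a request that will return a 401
  if (!username || !password) {
    throw new Error('Missing WP_USERNAME or WP_APP_PASSWORD')
  }

  // btoa is a global in browsers and recent Node.js versions
  return 'Basic ' + btoa(`${username}:${password}`)
}

const header = basicAuthHeader({WP_USERNAME: 'user', WP_APP_PASSWORD: 'pass'})
console.log(header) // Basic dXNlcjpwYXNz
```

In the fetch function, the hard-coded credentials could then be replaced with something like `headers.set('Authorization', basicAuthHeader(process.env))`.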



## Serializing raw content from WordPress



Now that your migration script can retrieve raw content, it's time to see what that looks like. Earlier in this course, you installed [@wordpress/block-serialization-default-parser](https://www.npmjs.com/package/@wordpress/block-serialization-default-parser), a library that takes raw content – with all the block editor's comments and unprocessed "shortcodes" – and converts it into an array of objects.



This serialized data is much simpler to work with and convert into Portable Text. Now, you can target each individual block by its name and create whatever block content shape you like.



Deep inside "inner blocks," the content is still stored as HTML, so you will still need to use the same `htmlToBlockContent` function from the last lesson to convert that HTML into block content – but now targeting and processing content layouts like columns is much simpler.



- [ ] **Create** a helper function to process raw content from WordPress, converting paragraphs and columns into Portable Text.


```typescript:./migrations/import-wp/lib/serializedHtmlToBlockContent.ts
import type {htmlToBlocks} from '@portabletext/block-tools'
import {parse} from '@wordpress/block-serialization-default-parser'
import type {SanityClient, TypedObject} from 'sanity'

import {htmlToBlockContent} from './htmlToBlockContent'

export async function serializedHtmlToBlockContent(
  html: string,
  client: SanityClient,
  imageCache: Record<number, string>,
) {
  // Parse content.raw HTML into WordPress blocks
  const parsed = parse(html)

  let blocks: ReturnType<typeof htmlToBlocks> = []

  for (const wpBlock of parsed) {
    // Convert inner HTML to Portable Text blocks
    if (wpBlock.blockName === 'core/paragraph') {
      const block = await htmlToBlockContent(wpBlock.innerHTML, client, imageCache)
      blocks.push(...block)
    } else if (wpBlock.blockName === 'core/columns') {
      const columnBlock = {_type: 'columns', columns: [] as TypedObject[]}
      for (const column of wpBlock.innerBlocks) {
        const columnContent = []
        // Named innerBlock so it does not shadow the outer columnBlock
        for (const innerBlock of column.innerBlocks) {
          const content = await htmlToBlockContent(innerBlock.innerHTML, client, imageCache)
          columnContent.push(...content)
        }
        columnBlock.columns.push({
          _type: 'column',
          content: columnContent,
        })
      }
      blocks.push(columnBlock)
    } else if (!wpBlock.blockName) {
      // Do nothing
    } else {
      console.log(`Unhandled block type: ${wpBlock.blockName}`)
    }
  }

  return blocks
}
```
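
Before writing a handler for every block, it can help to survey which block names your content actually contains. The sketch below walks a parsed block tree and counts each name, including nested inner blocks. The `ParsedBlock` type is a simplified subset of what the parser returns, and the `sample` array is hand-written stand-in data rather than real parser output; `countBlockNames` is a hypothetical helper.

```typescript
// Minimal shape of a parsed WordPress block (a subset of what the parser returns)
type ParsedBlock = {
  blockName: string | null
  innerHTML: string
  innerBlocks: ParsedBlock[]
}

// Count how often each block name appears, including nested inner blocks
function countBlockNames(
  blocks: ParsedBlock[],
  counts: Record<string, number> = {},
): Record<string, number> {
  for (const block of blocks) {
    // Blocks without a name are whitespace or freeform HTML between block comments
    if (block.blockName) {
      counts[block.blockName] = (counts[block.blockName] ?? 0) + 1
    }
    countBlockNames(block.innerBlocks, counts)
  }
  return counts
}

// Hand-written sample standing in for the output of parse(content.raw)
const sample: ParsedBlock[] = [
  {blockName: 'core/paragraph', innerHTML: '<p>Hello</p>', innerBlocks: []},
  {blockName: null, innerHTML: '\n\n', innerBlocks: []},
  {
    blockName: 'core/columns',
    innerHTML: '<div class="wp-block-columns"></div>',
    innerBlocks: [
      {
        blockName: 'core/column',
        innerHTML: '<div class="wp-block-column"></div>',
        innerBlocks: [{blockName: 'core/paragraph', innerHTML: '<p>Left</p>', innerBlocks: []}],
      },
    ],
  },
]

// Counts: core/paragraph 2, core/columns 1, core/column 1
console.log(countBlockNames(sample))
```

Running something like this over a sample of your documents gives a list of block types to prioritize, instead of discovering them one "Unhandled block type" log line at a time.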

## Update the migration script



- [ ] **Update** your request to WordPress to use authentication:


```typescript:./migrations/import-wp/index.ts
let wpData = await wpDataTypeFetch(wpType, page, true)
```

- [ ] **Update** your `doc.content` field to use the serialized raw content for your documents:


```typescript:./migrations/import-wp/lib/transformToPost.ts
doc.content = wpDoc.content.raw
  ? await serializedHtmlToBlockContent(wpDoc.content.raw, client, existingImages)
  : undefined
```

Run your posts and pages migrations again. 



```sh
npx sanity@latest migration run import-wp --no-dry-run --type=posts
```

If your existing content used the core columns block like this:



![WordPress sample page showing a three column layout](https://cdn.sanity.io/images/3do82whm/next/eb4913c7f1963ff9930a95f0879d8685dc971ea4-2144x1388.png)

You should see Sanity documents that use the newly created Portable Text columns object like this:



![Sanity Studio showing the Portable Text editor with a columns block](https://cdn.sanity.io/images/3do82whm/next/aa5c2481505c8b6e2614bcf59cdcde8c4ae7934a-2144x1388.png)

## Enhance the editorial experience



As you may notice, the block preview shows raw, JSON-like data. This isn't super helpful for most content teams (unless they're into raw data). The last step is to update [the block preview](https://www.sanity.io/learn/docs/studio/previews-list-views) to show the columns a little more nicely:



```typescript:schemaTypes/columnsType.ts
import {defineField, defineType} from 'sanity'

export const columnsType = defineType({
  name: 'columns',
  type: 'object',
  fields: [
    defineField({
      name: 'columns',
      type: 'array',
      of: [{type: 'column'}],
    }),
  ],
  preview: {
    select: {
      columns: 'columns',
    },
    prepare({columns}) {
      // columns is undefined until the first column is added
      const columnsCount = columns?.length ?? 0
      return {
        title: `${columnsCount} column${columnsCount === 1 ? '' : 's'}`,
      }
    },
  },
})

```

- [ ] **Update** the `columnsType` schema type with the new preview configuration.


Your Studio should now have a preview like this:



![The Portable Text editor showing a preview for the column block type saying "2 columns"](https://cdn.sanity.io/images/3do82whm/next/f2b7174fa85eab4bc5ba6abd98ba6f789e3dd569-645x364.png)

```typescript:schemaTypes/columnType.ts
import {defineField, defineType} from 'sanity'

export const columnType = defineType({
  name: 'column',
  type: 'object',
  fields: [
    defineField({
      name: 'content',
      type: 'portableText',
    }),
  ],
  preview: {
    select: {
      title: 'content',
    },
  },
})

```

- [ ] **Update** the `columnType` schema type with the new preview configuration.


If you click into the columns block type, you should see the first bit of content in the individual columns:



![Image](https://cdn.sanity.io/images/3do82whm/next/823d3773841377257950b3b55a5d98c643d23630-673x325.png)

You can further enhance this preview with [custom preview components](https://www.sanity.io/learn/docs/customization), using React components to show an even richer preview within the Portable Text editor.



This lesson has only scratched the surface of converting raw WordPress content into Portable Text, but you now have a plan to convert all other WordPress blocks:



1. Create a custom schema type in Sanity Studio's Portable Text editor for each new block

2. Intercept that block during serialization and convert to Portable Text

3. Make sure that the block previews are helpful for your content team


---

## Lesson 10: Restructuring content
https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/restructuring-content

Take the opportunity to transform content from presentation-focused in WordPress to structured content in Sanity

This lesson has no tasks but should inspire the sorts of content refactoring you can do as part of your migration. Your decisions will depend on the shape of your content and the intended outcomes of your re-platforming.



## Get out of "the pages lock-in"



You may have content stored in pages or posts that represents much more than what the concept of a "page" or "post" accurately describes. Modeling that content as what it really is makes it reusable, so you don't have to copy and paste it across surfaces.



In this example, imagine you have content stored in pages but separated by page template. 



- Staff profiles use a "staff" template

- Office locations use an "office" template


You can find the template name in the `template` field of a page's REST API response:



```json
{
  "title": "Emkay Petersen",
  "template": "staff.php"
}
```

This presents a great opportunity in your transformation scripts to convert meaningful content currently trapped in presentational thinking into structured content.



## Expand your schema types



The next step is to create Sanity Studio schema types with more appropriate descriptions of that content. A more accurate type name increases the likelihood of reusing that content in the future.



- `person` for staff profiles

- `location` for offices


With new Sanity Studio schema types created, re-run `schema extract` and `typegen generate` to create new, useful TypeScript types for your transform functions.



## Maybe pages, maybe not



Lastly, you would create a transform function that accepts the WordPress page as before, but will create new document types instead of just returning pages.



```typescript:migrations/import-wp/lib/transformToPage.ts
import {decode} from 'html-entities'
import type {SanityClient} from 'sanity'
import type {WP_REST_API_Post} from 'wp-types'

import type {Location, Page, Person} from '../../../sanity.types'

// Remove these keys because they'll be created by Content Lake
type StagedPage = Omit<Page, '_createdAt' | '_updatedAt' | '_rev'>
type StagedPerson = Omit<Person, '_createdAt' | '_updatedAt' | '_rev'>
type StagedLocation = Omit<Location, '_createdAt' | '_updatedAt' | '_rev'>

export async function transformToPage(
  wpDoc: WP_REST_API_Post,
  client: SanityClient,
  existingImages: Record<string, string> = {},
): Promise<StagedPage | StagedPerson | StagedLocation> {
  if (wpDoc.template === 'staff.php') {
    const doc: StagedPerson = {
      _id: `person-${wpDoc.id}`,
      _type: 'person',
    }

    doc.name = decode(wpDoc.title.rendered).trim()

    return doc
  } else if (wpDoc.template === 'office.php') {
    const doc: StagedLocation = {
      _id: `location-${wpDoc.id}`,
      _type: 'location',
    }

    doc.name = decode(wpDoc.title.rendered).trim()

    return doc
  }

  const doc: StagedPage = {
    _id: `page-${wpDoc.id}`,
    _type: 'page',
  }

  doc.title = decode(wpDoc.title.rendered).trim()

  return doc
}
```
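
As the number of templates grows, the branching above could be driven by a lookup table instead. The following is only a sketch of that idea: `TEMPLATE_TO_TYPE` and `resolveTarget` are hypothetical names, and the `person` and `location` types assume the schema types suggested earlier in this lesson.

```typescript
type TargetType = 'person' | 'location' | 'page'

// Hypothetical mapping from WordPress page template to Sanity document type
const TEMPLATE_TO_TYPE = new Map<string, TargetType>([
  ['staff.php', 'person'],
  ['office.php', 'location'],
])

// Resolve the document ID and type for a WordPress page; anything unmapped stays a page
function resolveTarget(
  wpId: number,
  template: string | undefined,
): {_id: string; _type: TargetType} {
  const _type = TEMPLATE_TO_TYPE.get(template ?? '') ?? 'page'
  return {_id: `${_type}-${wpId}`, _type}
}

console.log(resolveTarget(12, 'staff.php')) // _id "person-12", _type "person"
console.log(resolveTarget(13, '')) // _id "page-13", _type "page"
```

Keeping the mapping in one place means adding a new template only requires a new entry plus the field transformations for that type.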

Re-platforming is an excellent opportunity to make huge improvements to your content model and unlock content reuse far beyond just your website.



---

## Lesson 11: Conclusion
https://www.sanity.io/learn/course/migrating-content-from-wordpress-to-sanity/whats-next

You now have all the skills to convert individual post types from WordPress' presentation-focused types to structured content in Sanity Studio.

Completing your migration to Sanity is now in your hands. You'll still need to:



1. Create transformation functions for all other post types, like pages, categories, tags, and comments.

2. Create Sanity Studio schema types for any custom post types, and then make transformation functions for those as well.

3. Handle any images that could not be found inside rich text, and are now still stored as `externalImage` type objects.


And start building out your new front end(s) to support your content!
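
For the third item in the list above, a first step might be finding which `externalImage` blocks remain. The sketch below collects their URLs from a content array, including blocks nested inside columns. The `ContentItem` type is a loose assumption about the shape of your Portable Text, and `findExternalImageUrls` is a hypothetical helper, not part of any Sanity package.

```typescript
// Loose shape for items in a Portable Text array (assumed for this sketch)
type ContentItem = {
  _type: string
  url?: string
  columns?: {content?: ContentItem[]}[]
}

// Collect URLs of externalImage blocks, including those nested inside columns
function findExternalImageUrls(content: ContentItem[] = []): string[] {
  const urls: string[] = []

  for (const item of content) {
    if (item._type === 'externalImage' && item.url) {
      urls.push(item.url)
    }
    // Recurse into each column's content, if this item is a columns block
    for (const column of item.columns ?? []) {
      urls.push(...findExternalImageUrls(column.content))
    }
  }

  return urls
}

const leftovers = findExternalImageUrls([
  {_type: 'block'},
  {_type: 'externalImage', url: 'https://example.com/a.jpg'},
  {
    _type: 'columns',
    columns: [{content: [{_type: 'externalImage', url: 'https://example.com/b.jpg'}]}],
  },
])

console.log(leftovers.length) // 2
```

A list like this can feed a manual review or a second upload pass that does not depend on finding the image through the WordPress media endpoint.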



If there's anything you feel is unclear or has been missed in this course, please use the feedback form below on any lesson.



Good luck!



