ePub Author Question – What Are the Parts of an ePub File?

Let’s open up the hood and see what’s inside an ePub.

The first thing to know is that an ePub file is a compressed collection of files packaged as a ZIP archive. In fact, if you make a copy of an ePub file and change the copy’s file extension from .epub to .zip, you get an ordinary .zip file that can be unzipped so we can view its contents:

An ePub file and a copy of the file with the file extension changed from .epub to .zip
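
The copy-and-rename step can also be scripted. The following is a minimal Python sketch, assuming only the standard library; the function name `copy_and_extract` and the stand-in file names are made up for illustration, and the tiny archive it builds is not a valid book.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def copy_and_extract(epub_path: Path, out_dir: Path) -> list[str]:
    """Copy the .epub to a sibling .zip file, unzip the copy, return entry names."""
    zip_path = epub_path.with_suffix(".zip")
    shutil.copyfile(epub_path, zip_path)  # the original .epub is left untouched
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        return archive.namelist()


# Build a tiny stand-in archive so the sketch runs without a real book.
work = Path(tempfile.mkdtemp())
sample = work / "sample.epub"
with zipfile.ZipFile(sample, "w") as z:
    z.writestr("mimetype", "application/epub+zip")
    z.writestr("META-INF/container.xml", "<container/>")
    z.writestr("OEBPS/content.opf", "<package/>")

names = copy_and_extract(sample, work / "unzipped")
print(names)
```

Python's `zipfile` module does not look at the file extension, so it can also open the .epub directly; the copy is made here only to mirror the manual steps.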

We can now unzip the .zip file and view its contents. After unzipping, we see that this ePub file consists of two folders (the OEBPS folder and the META-INF folder) and one file (the mimetype file):

The three main parts of an ePub file: two folders (the OEBPS folder and the META-INF folder) and one file (the mimetype file).
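
The same top-level view can be obtained without extracting anything. Here is a hedged Python sketch that reads the archive's entry names and keeps only the first path component; `top_level_parts` is a hypothetical helper, and the sample archive is a stand-in built in memory.

```python
import io
import zipfile


def top_level_parts(epub_bytes: bytes) -> list[str]:
    """Return the sorted top-level names; folders keep a trailing slash."""
    parts = set()
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        for name in archive.namelist():
            head, sep, _rest = name.partition("/")
            parts.add(head + sep)  # "OEBPS/" for folders, "mimetype" for files
    return sorted(parts)


# Stand-in archive with the same layout as the book in this article.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as z:
    z.writestr("mimetype", "application/epub+zip")
    z.writestr("META-INF/container.xml", "<container/>")
    z.writestr("OEBPS/content.opf", "<package/>")
    z.writestr("OEBPS/Text/New_Manuals.xhtml", "<html/>")

parts = top_level_parts(buffer.getvalue())
print(parts)  # ['META-INF/', 'OEBPS/', 'mimetype']
```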

If we open the META-INF folder, we can see that it has one file (the container.xml file) as follows: 

The one file in the META-INF folder - the container.xml file.

The container.xml file provides the location of the content.opf file, as shown in the following image. The content.opf file contains important information such as the ePub’s metadata (author name, publication date, etc.), the manifest (a list of every item in the ePub file), and the spine (the order in which items are viewed as the reader scrolls through the ePub). The content.opf file will be discussed shortly.

If encryption or digital rights management has been added to the ePub file, the META-INF folder will typically hold additional files (such as encryption.xml or rights.xml) alongside container.xml. The container.xml file has been opened below in the text editor Notepad++, which works well on a PC. On a Mac, you might use a text editor such as TextWrangler.

The container.xml file provides the location of the content.opf file.
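
To make the lookup concrete, here is a small Python sketch that reads the `full-path` attribute of the `rootfile` element in container.xml. The sample XML is a typical container.xml written out for illustration, not a copy of the file in the screenshot, and `find_opf_path` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace used by container.xml.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

SAMPLE_CONTAINER = b"""<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def find_opf_path(container_xml: bytes) -> str:
    """Return the archive path of the .opf file named in container.xml."""
    root = ET.fromstring(container_xml)
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml has no <rootfile> element")
    return rootfile.attrib["full-path"]


opf_path = find_opf_path(SAMPLE_CONTAINER)
print(opf_path)  # OEBPS/content.opf
```

Reading the path from container.xml, rather than assuming OEBPS/content.opf, matters because other ePub files may name or place the .opf file differently.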

Below is the mimetype file opened in Notepad++. The sole purpose of the mimetype file is to indicate that this is an ePub file.

The mimetype file has just one line, which states that the file is an ePub file.
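
That one line is the text `application/epub+zip`. The ePub container format also expects the mimetype file to be the first entry in the archive and to be stored uncompressed. The sketch below checks those three points in Python; `check_mimetype` and `build_archive` are hypothetical helpers, and the archives are stand-ins built in memory.

```python
import io
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"


def check_mimetype(epub_bytes: bytes) -> list[str]:
    """Return a list of problems found with the mimetype entry (empty if none)."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        entries = archive.infolist()
        if not entries or entries[0].filename != "mimetype":
            return ["mimetype is not the first entry in the archive"]
        first = entries[0]
        if first.compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype is compressed")
        if archive.read(first) != EPUB_MIMETYPE:
            problems.append("mimetype does not contain exactly application/epub+zip")
    return problems


def build_archive(files: list[tuple[str, str]]) -> bytes:
    """Build an uncompressed ZIP in memory from (name, text) pairs, in order."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as z:
        for name, text in files:
            z.writestr(name, text)
    return buffer.getvalue()


good = build_archive([("mimetype", "application/epub+zip"),
                      ("META-INF/container.xml", "<container/>")])
wrong_order = build_archive([("META-INF/container.xml", "<container/>"),
                             ("mimetype", "application/epub+zip")])

print(check_mimetype(good))         # []
print(check_mimetype(wrong_order))  # ['mimetype is not the first entry in the archive']
```

Many general-purpose zip tools reorder or compress entries when an unzipped book is zipped back up, which is one common reason a repackaged ePub fails validation.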

Clicking on the OEBPS (Open eBook Publication Structure) folder reveals three folders (the Images folder, the Styles folder, and the Text folder) and two files (the content.opf file and the toc.ncx file):

The OEBPS folder's three folders (the Images folder, the Styles folder, and the Text folder) and two files (the content.opf file and the toc.ncx file).

Opening the content.opf file in Notepad++ reveals the three main parts of this file. The first part of the content.opf file, shown below, contains all of the metadata (author name, publication date, etc.) for the ePub file. The second part is the manifest for the entire ePub file. Every item in the ePub file is listed in the manifest.

The content.opf file's metadata section and the manifest section.
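
As a rough illustration of these two sections, the Python sketch below pulls the Dublin Core metadata fields and the manifest items out of a small .opf document. The sample content (title, author, file names) is invented for the example, and `read_metadata` and `read_manifest` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample; a real content.opf lists every file in the book.
SAMPLE_OPF = """\
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Sample Book</dc:title>
    <dc:creator>Jane Author</dc:creator>
    <dc:date>2012-01-01</dc:date>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="ch1" href="Text/ch1.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="Styles/stylesheet.css" media-type="text/css"/>
  </manifest>
  <spine toc="ncx"><itemref idref="ch1"/></spine>
</package>
"""


def read_metadata(opf_xml: str) -> dict[str, str]:
    """Map each Dublin Core field name (title, creator, ...) to its text."""
    metadata = ET.fromstring(opf_xml).find("opf:metadata", NS)
    if metadata is None:
        return {}
    prefix = "{" + NS["dc"] + "}"
    return {
        child.tag[len(prefix):]: (child.text or "").strip()
        for child in metadata
        if child.tag.startswith(prefix)
    }


def read_manifest(opf_xml: str) -> dict[str, str]:
    """Map each manifest item id to its href (relative to the .opf file)."""
    root = ET.fromstring(opf_xml)
    return {item.attrib["id"]: item.attrib["href"]
            for item in root.findall("opf:manifest/opf:item", NS)}


print(read_metadata(SAMPLE_OPF))
print(read_manifest(SAMPLE_OPF))
```

Note that the `href` values are relative to the folder holding content.opf, which is why they begin with Text/ or Styles/ rather than OEBPS/.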

The third part of the content.opf file is the spine. The spine, shown below, provides the order in which the parts of the ePub file will be viewed as the reader scrolls through the ePub eBook. 

The content.opf file's spine section.
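
The spine does not name files directly: each `itemref` points at a manifest item by its id. The Python sketch below resolves those references into a reading order. The sample .opf is invented, with the manifest deliberately listed in a different order from the spine, and `reading_order` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Invented sample: the manifest order differs from the spine order on purpose.
SAMPLE_OPF = """\
<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="ch2" href="Text/ch2.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch1" href="Text/ch1.xhtml" media-type="application/xhtml+xml"/>
    <item id="cover" href="Text/cover.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="cover"/>
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>
"""


def reading_order(opf_xml: str) -> list[str]:
    """Return the hrefs of the spine items in the order a reader sees them."""
    root = ET.fromstring(opf_xml)
    href_by_id = {item.attrib["id"]: item.attrib["href"]
                  for item in root.findall("opf:manifest/opf:item", OPF_NS)}
    # A KeyError here would mean the spine refers to an id missing from the manifest.
    return [href_by_id[ref.attrib["idref"]]
            for ref in root.findall("opf:spine/opf:itemref", OPF_NS)]


order = reading_order(SAMPLE_OPF)
print(order)  # ['Text/cover.xhtml', 'Text/ch1.xhtml', 'Text/ch2.xhtml']
```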

If we open up the toc.ncx file in Notepad++, we can view the contents of the ePub’s built-in navigational table of contents as follows: 

The toc.ncx file showing the built-in navigational table of contents.
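
For readers who want to see the structure in text form, here is a hedged Python sketch that lists each `navPoint` in a toc.ncx document as a (label, target) pair. The sample entries are invented apart from the New_Manuals.xhtml file name, and `read_toc` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

NCX_URI = "http://www.daisy.org/z3986/2005/ncx/"
NCX_NS = {"n": NCX_URI}

# Invented sample with two table-of-contents entries.
SAMPLE_NCX = """\
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <navMap>
    <navPoint id="navPoint-1" playOrder="1">
      <navLabel><text>Introduction</text></navLabel>
      <content src="Text/Introduction.xhtml"/>
    </navPoint>
    <navPoint id="navPoint-2" playOrder="2">
      <navLabel><text>New Manuals</text></navLabel>
      <content src="Text/New_Manuals.xhtml"/>
    </navPoint>
  </navMap>
</ncx>
"""


def read_toc(ncx_xml: str) -> list[tuple[str, str]]:
    """Return (label, target file) for every navPoint, in document order."""
    root = ET.fromstring(ncx_xml)
    entries = []
    for point in root.iter("{" + NCX_URI + "}navPoint"):
        label = point.findtext("n:navLabel/n:text", default="", namespaces=NCX_NS)
        content = point.find("n:content", NCX_NS)
        target = content.attrib.get("src", "") if content is not None else ""
        entries.append((label.strip(), target))
    return entries


toc = read_toc(SAMPLE_NCX)
for label, target in toc:
    print(f"{label} -> {target}")
```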

 Clicking on the Text folder reveals the collection of XHTML files that are the contents of the ePub eBook. Each XHTML file is a single section of the eBook.

The Text folder's XHTML files. Each XHTML file is a separate section of the ePub eBook.

Opening up one of these XHTML files (New_Manuals.xhtml) shows the XHTML code. This is the same code that appears on web pages. An ePub file is just like a mini web site. One line of code contains a hyperlink and the last line links to an image, just like the HTML on a web page. 

The XHTML code of one of the sections of an ePub file, just like a web page.
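
Because XHTML is well-formed XML, the hyperlinks and image references in a section can be pulled out with an ordinary XML parser. The Python sketch below does this for a small invented page; the URL and image name are placeholders, not the contents of the real New_Manuals.xhtml, and `links_and_images` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"

# Invented page: one hyperlink, and an image on the last line of the body.
SAMPLE_PAGE = """\
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>New Manuals</title>
  <link href="../Styles/stylesheet.css" rel="stylesheet" type="text/css"/>
</head>
<body>
  <h1>New Manuals</h1>
  <p>See the <a href="http://example.com/manuals">full list</a>.</p>
  <p><img src="../Images/cover.jpg" alt="Cover"/></p>
</body>
</html>
"""


def links_and_images(xhtml: str) -> tuple[list[str], list[str]]:
    """Return (hyperlink targets, image sources) found in one XHTML section."""
    root = ET.fromstring(xhtml)
    links = [a.attrib["href"] for a in root.iter(XHTML + "a") if "href" in a.attrib]
    images = [img.attrib["src"] for img in root.iter(XHTML + "img") if "src" in img.attrib]
    return links, images


links, images = links_and_images(SAMPLE_PAGE)
print(links)   # ['http://example.com/manuals']
print(images)  # ['../Images/cover.jpg']
```

The `../Images/` prefix reflects the folder layout described above: the page lives in the Text folder, so it steps up one level to reach the Images folder.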

If we open any of the XHTML files in a web browser, it displays just like a web page. Below, the file from above (New_Manuals.xhtml) has been opened in the web browser Firefox. This demonstrates how similar an ePub file is to a web site. In fact, in my experience the best tool for creating an ePub is an HTML editor of the kind used to build web sites, such as Dreamweaver or Microsoft Expression Web (my favorite).

Opening one of the ePub file's XHTML files in the web browser Firefox. This shows how similar an ePub file is to a web site.

Clicking on the Styles folder shows a CSS style sheet (stylesheet.css). In an ePub laid out like this one, the Styles folder will normally contain at least one CSS style sheet, and there can be more than one. Opening stylesheet.css in Notepad++ shows the CSS styles in this style sheet, which control the formatting and styling of the XHTML pages that link to it.

The CSS style sheet that controls all formatting and styling in this ePub document.
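
A style sheet only affects the pages that link to it. The following Python sketch, under the assumption that each page uses a `<link rel="stylesheet">` element as web pages do, finds those links and resolves them to paths inside the archive. The page content is invented, and `linked_stylesheets` is a hypothetical helper name.

```python
import posixpath
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"

# Invented page head that links one style sheet.
SAMPLE_PAGE = """\
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>New Manuals</title>
  <link href="../Styles/stylesheet.css" rel="stylesheet" type="text/css"/>
</head>
<body><p>Text styled by the linked sheet.</p></body>
</html>
"""


def linked_stylesheets(xhtml: str, page_path: str) -> list[str]:
    """Return archive paths of the style sheets linked from one XHTML page."""
    root = ET.fromstring(xhtml)
    page_dir = posixpath.dirname(page_path)  # ZIP entry names use forward slashes
    sheets = []
    for link in root.iter(XHTML + "link"):
        rel_values = link.attrib.get("rel", "").lower().split()
        if "stylesheet" in rel_values and "href" in link.attrib:
            joined = posixpath.join(page_dir, link.attrib["href"])
            sheets.append(posixpath.normpath(joined))  # collapses the "../"
    return sheets


sheets = linked_stylesheets(SAMPLE_PAGE, "OEBPS/Text/New_Manuals.xhtml")
print(sheets)  # ['OEBPS/Styles/stylesheet.css']
```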

The Images folder contains all of the images (JPEG, GIF, or PNG files) in the ePub document, as shown below:

All of the image files within the ePub document.

Now you see how it all fits together and how an ePub document is very similar to a mini web site.

6 thoughts on “ePub Author Question – What Are the Parts of an ePub File?”

    • The only important item is the “mimetype” file. It must be the first file in the root directory. With the format shown, it is the only file in the root directory (META-INF and OEBPS are folders), so you’re home free!

  2. Great information! I’m having problems uploading my ePub file to Apple’s iBookstore (it keeps failing, even though it’s a valid ePub file). This article helped me understand the make-up of an ePub file, and now that I know how an ePub file is made, what it contains, and how the files within it relate to each other, I might have figured it out. Apple Support is like Microsoft Help: it doesn’t help! They can’t tell me what is wrong with the file, and their documentation does not say what’s wrong, what the generated log file means, or how to fix the issue, so frustration abounds. The problem might be that, when creating the ePub file, the original file has to have all spaces and special characters removed from its file name (I had spaces in the file name…). Have to verify that…
