Exporting a video in iOS to reduce size and ensure maximum client compatibility

Movies that lie on their side

Have you ever taken a portrait-oriented movie with your iPhone or iPad and sent it to friends who were using Windows? I can understand their frustration when the video I so carefully recorded forces them to bend their necks to the side:

[Image: rotated_portrait]

So what’s going on here? The problem is that the .MOV file I send them has a 90-degree rotation set as its preferred transform. Some clients on Windows (such as Windows Media Player and VLC media player) do not take this transform into account when presenting the video.

The VideoExport Swift Project

I wanted to create a project that would allow me to address the issue with rotations and other compatibility concerns.

I published my results to scottcarter/VideoExport on GitHub.

The project uses a combination of Swift and Objective-C.   This article will discuss some of the considerations behind VideoExport.  Detailed project notes can be found within the code comments and in particular within the README.h header file.

Exporting the .MOV file

I set the following criteria as my goal for an exported video:

  • Dimensions reduced to no greater than 512 x 288 (16:9) or 480 x 360 (4:3)
  • Rotation (preferred transform) of container set to 0 degrees (CGAffineTransformIdentity).
  • Video profile level set to AVVideoProfileLevelH264Baseline30 (Baseline@L3.0)
  • Video bitrate reduced to average of 725,000 bps
  • MP4 container

These criteria all relate either to reducing the file size or to providing maximum compatibility across clients. I’ll discuss each of them separately.

Dimensions of exported video

I started by looking at the restrictions imposed by various services and clients. I noted that both MPMoviePlayerController (the iOS movie player) and Dropbox state support for video up to 640 x 480 (such as that taken by an iPhone’s front-facing camera). I decided to go below this, down to 480 x 360, for 4:3 ratio videos.

One consideration that comes into play is macroblock division: how evenly the width and height divide into blocks of pixels. Dimensions that divide evenly into 16×16 pixel macroblocks are ideal, with 8×8 being second best. Choosing one of these divisions reduces the load on the processor that renders the video.

There is an excellent paper that discusses this and many other helpful topics:

Mobile Encoding Guidelines for Android Powered Devices
Addendum to Video Encoding Cookbook and Profile Guidelines for the Adobe Flash Platform
By Maxim Levkov, Adobe Systems Inc.

You’ll note that 480×360 is one of the recommended sizes for 4:3 listed in Tables 4 and 5.    For 16:9 ratio movies (such as those taken with an iPhone’s rear facing camera), I chose the size 512 x 288. This has an even better macroblock division of 16×16.
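The divisibility claim is easy to check in code. The following is a minimal sketch rather than code from the VideoExport project: PixelSize, targetSize and macroblockDivision are hypothetical names, and the 1.5 aspect-ratio cut-off between 4:3 and 16:9 is my own assumption.

```swift
// Hypothetical helpers for illustration; not taken from the VideoExport source.
struct PixelSize: Equatable {
    var width: Int
    var height: Int
}

/// Picks the maximum export size for a landscape source:
/// 512 x 288 for wide (16:9) material, 480 x 360 for 4:3 material.
func targetSize(forSourceWidth width: Int, height: Int) -> PixelSize {
    let ratio = Double(width) / Double(height)
    // Assumption: anything wider than 1.5:1 is treated as 16:9.
    if ratio > 1.5 {
        return PixelSize(width: 512, height: 288)
    }
    return PixelSize(width: 480, height: 360)
}

/// Largest macroblock size (16 or 8) that divides both dimensions evenly,
/// or nil when neither does.
func macroblockDivision(of size: PixelSize) -> Int? {
    for block in [16, 8] where size.width % block == 0 && size.height % block == 0 {
        return block
    }
    return nil
}

let wide = targetSize(forSourceWidth: 1920, height: 1080)
let standard = targetSize(forSourceWidth: 640, height: 480)
print(macroblockDivision(of: wide) ?? 0)      // 16
print(macroblockDivision(of: standard) ?? 0)  // 8
```

360 is not a multiple of 16, which is why 480 x 360 only reaches the 8×8 division while 512 x 288 reaches 16×16.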

Rotation

I will be exporting the video with a preferred transform set to CGAffineTransformIdentity (rotation of 0 degrees) to avoid any issues with clients not respecting this property.   What this means is that I will need to recognize when the source video has a rotation of 90 or 270 degrees and swap the width and height in that case for the exported video.
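As a rough illustration of that check, the rotation can be recovered from the a and b components of the track's preferred transform (a CGAffineTransform). This is a sketch under my own assumptions, not the project's code: rotationDegrees and exportDimensions are hypothetical names, and the components are passed as plain Doubles so the example needs nothing beyond Foundation.

```swift
import Foundation

/// Rotation in degrees (0, 90, 180 or 270) implied by the `a` and `b`
/// components of an affine transform such as a track's preferred transform.
func rotationDegrees(a: Double, b: Double) -> Int {
    let degrees = atan2(b, a) * 180.0 / Double.pi
    // Snap to the nearest quarter turn, then normalise into 0..<360.
    let snapped = Int((degrees / 90.0).rounded()) * 90
    return ((snapped % 360) + 360) % 360
}

/// Export dimensions once the rotation is baked into the frames:
/// width and height trade places for 90 and 270 degrees.
func exportDimensions(width: Int, height: Int, rotation: Int) -> (width: Int, height: Int) {
    if rotation == 90 || rotation == 270 {
        return (width: height, height: width)
    }
    return (width: width, height: height)
}

// A 90-degree rotation has a = cos(90°) = 0 and b = sin(90°) = 1.
let rotation = rotationDegrees(a: 0, b: 1)
let size = exportDimensions(width: 512, height: 288, rotation: rotation)
print(rotation, size.width, size.height)  // 90 288 512
```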

Video Profile Level

Both MPMoviePlayerController and Dropbox state support for H.264 Baseline Profile Level 3.0 video. I also discovered that this profile level should be used to ensure support across a wide range of Android devices: I was not able to get video to play on the Android emulator unless this profile level was used.

The Android Supported Media Formats document states support only for the Baseline Profile when using H.264 encoding with MP4.

Video bitrate

The video bitrate has a direct effect on the resulting file size.  I settled on a bitrate of 725kbps based on some research from the following references:

AVAssetExportSession
An AVAssetExportSession object can be used to transcode the contents of an AVAsset source object.  You can use various presets to simplify this process including AVAssetExportPresetMediumQuality which produces the following output:

Video: 713kbps, 30fps.
Audio: Bit rate = 64.0 kbps, Sampling rate = 44.1 kHz, 1 channel
Overall bitrate = 780kbps.

Bitrate and video optimization article
What bitrate should I use when encoding my video?
How do I optimize my video for the web?
Video: 480 x 360 x 30 fps x 2 (average motion) x 0.07 = 725 kbps (video bitrate)
Audio: Mono, 16 – 24 kbps rate, 22.05 kHz acceptable for speech.
Overall bitrate = ~749kbps.

Adobe bitrate calculator
Robert Reinhardt’s Flash video (FLV) bitrate calculator
For 480 x 360:
Video: 753kbps,
Audio: Mono, Medium quality = 48.0 kbps, 44.1 kHz
Overall bitrate=801kbps
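
The formula quoted from the second reference (width x height x frame rate x motion factor x 0.07) can be worked through directly. The function name below is hypothetical, and integer arithmetic is used only to keep the result exact.

```swift
/// Estimated video bitrate in bits per second:
/// width x height x frames per second x motion factor x 0.07
func estimatedVideoBitrate(width: Int, height: Int,
                           framesPerSecond: Int, motionFactor: Int) -> Int {
    let pixelsPerSecond = width * height * framesPerSecond
    // Multiply by 7 and divide by 100 instead of using 0.07 to stay in integers.
    return pixelsPerSecond * motionFactor * 7 / 100
}

let bitsPerSecond = estimatedVideoBitrate(width: 480, height: 360,
                                          framesPerSecond: 30, motionFactor: 2)
print(bitsPerSecond)  // 725760
```

The result, 725,760 bps, is where the 725 kbps figure comes from.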

MP4 Container

I chose MP4 as my container for maximum compatibility across clients. The following article discusses the MOV and MP4 file formats:

Difference Between MOV and MP4

Meeting the Project Criteria

I looked at three approaches for implementing the project to meet the criteria that I established: AVAssetExportSession, AVAssetReader/AVAssetWriter, and SDAVAssetExportSession.

AVAssetExportSession

Using AVAssetExportSession I can:

  • Use a videoComposition
    a.  Set the rotation to 0
    b.  Scale width/height as needed.
  • Set a preset which allows me some control over the resolution, profile level and bit rate (AVAssetExportPresetMediumQuality comes closest)
  • Specify MPEG4 output format.
  • Add metadata.

What I can’t do is:

  • Specify the exact frame rate I want (30 fps).  The preset apparently overrides the videoComposition.frameDuration  (AVAssetExportPresetMediumQuality uses 26.087 fps)
  • Specify the exact video bitrate I want (AVAssetExportPresetMediumQuality uses 706 Kbps)
  • Specify an Android compatible profile level with medium quality preset.

AVAssetReader/AVAssetWriter

Using AVAssetReader and AVAssetWriter allows me some added flexibility over AVAssetExportSession, but I need to use AVAssetReaderVideoCompositionOutput to achieve what I’m looking for.  If I instead use AVAssetReaderTrackOutput I can:

  • Set the video average bit rate.
  • Set my profile level which allows me to be compatible with Android.
  • Specify the width and height for output.
  • Specify MPEG4 output format.
  • Add metadata.

What I can’t do with AVAssetReaderTrackOutput is:

  • Select the frame rate and make it constant.  I found it to be variable from 25 – 30 fps.
  • Force the rotation to be 0 and swap width/height.

Note that both AVAssetReaderVideoCompositionOutput and AVAssetReaderTrackOutput are subclasses of AVAssetReaderOutput.

SDAVAssetExportSession

What I really need is to use AVAssetReaderVideoCompositionOutput in combination with AVAssetReader and AVAssetWriter. This is tricky to set up and get right, but thankfully all the hard work has already been done in the SDAVAssetExportSession project.

SDAVAssetExportSession is a replacement for AVAssetExportSession that provides all the flexibility of that class (notably the ability to specify a videoComposition) while using AVAssetReader/AVAssetWriter under the hood.

SDAVAssetExportSession allows me to meet all the criteria I established for the VideoExport project.
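
To make the role of the videoComposition more concrete, here is a hedged sketch of how one might be built so that the rotation is baked into the frames and the picture is scaled to the target size. It is not the project's getVideoComposition() method: exportTransform and makeVideoComposition are hypothetical names, and the synchronous preferredTransform and naturalSize properties it reads are deprecated on recent iOS versions in favour of asynchronous loading.

```swift
import AVFoundation

/// Transform that applies the track's own rotation, moves the upright picture
/// to the origin, and then scales it to the render size.
func exportTransform(preferredTransform: CGAffineTransform,
                     naturalSize: CGSize,
                     renderSize: CGSize) -> CGAffineTransform {
    // Bounding box of the picture once the preferred transform is applied.
    let upright = CGRect(origin: .zero, size: naturalSize).applying(preferredTransform)
    let toOrigin = CGAffineTransform(translationX: -upright.minX, y: -upright.minY)
    let scale = CGAffineTransform(scaleX: renderSize.width / upright.width,
                                  y: renderSize.height / upright.height)
    return preferredTransform.concatenating(toOrigin).concatenating(scale)
}

/// Video composition that renders `track` upright at `renderSize` and 30 fps.
func makeVideoComposition(for track: AVAssetTrack,
                          duration: CMTime,
                          renderSize: CGSize) -> AVMutableVideoComposition {
    let layer = AVMutableVideoCompositionLayerInstruction(assetTrack: track)
    layer.setTransform(exportTransform(preferredTransform: track.preferredTransform,
                                       naturalSize: track.naturalSize,
                                       renderSize: renderSize),
                       at: .zero)

    let instruction = AVMutableVideoCompositionInstruction()
    instruction.timeRange = CMTimeRange(start: .zero, duration: duration)
    instruction.layerInstructions = [layer]

    let composition = AVMutableVideoComposition()
    composition.renderSize = renderSize
    composition.frameDuration = CMTime(value: 1, timescale: 30)  // 30 fps
    composition.instructions = [instruction]
    return composition
}
```

For a 1920 x 1080 track recorded in portrait, the render size would be 288 x 512, so the exported frames are already upright and the container can carry an identity transform.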

Audio

I didn’t need to spend a lot of time playing with audio settings for the export. SDAVAssetExportSession makes it easy to specify what I was looking for, which was:

  • Single channel (mono)
  • Bit rate = 64.0 Kbps
  • Sampling rate = 44.1 KHz
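
Pulling the criteria together, the video and audio targets can be written as AVFoundation settings dictionaries. This is a sketch of what such settings might look like, not a copy of the project's code: makeVideoSettings and makeAudioSettings are hypothetical names, and the choice of AAC as the audio format is my assumption (the article does not name the codec).

```swift
import AVFoundation

/// Video settings: H.264 Baseline@L3.0 at an average of 725,000 bps.
func makeVideoSettings(width: Int, height: Int) -> [String: Any] {
    return [
        AVVideoCodecKey: AVVideoCodecType.h264,
        AVVideoWidthKey: width,
        AVVideoHeightKey: height,
        AVVideoCompressionPropertiesKey: [
            AVVideoAverageBitRateKey: 725_000,
            AVVideoProfileLevelKey: AVVideoProfileLevelH264Baseline30
        ] as [String: Any]
    ]
}

/// Audio settings: mono, 44.1 kHz, 64 kbps. AAC is an assumption.
func makeAudioSettings() -> [String: Any] {
    return [
        AVFormatIDKey: kAudioFormatMPEG4AAC,
        AVNumberOfChannelsKey: 1,
        AVSampleRateKey: 44_100,
        AVEncoderBitRateKey: 64_000
    ]
}

// Portrait export of a 16:9 source: width and height are swapped.
let videoSettings = makeVideoSettings(width: 288, height: 512)
let audioSettings = makeAudioSettings()
```

As I understand its README, SDAVAssetExportSession exposes videoSettings and audioSettings properties that accept dictionaries of this shape, alongside an outputFileType that would be set to MPEG-4 here.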

Example Conversion

It’s useful to see an example of what I was able to achieve with an export.

Source input file
MPEG-4 QuickTime .mov
Size = 44.5MB

Video
1920 width x 1080 height (16:9)
Rotation = 90   (This was a Portrait oriented video)
Profile: Baseline@L4.1
Bit rate = 21.3 Mbps
Frame rate mode = Variable
Frame rate = 29.970 fps

Audio
Bit rate = 64.0 Kbps
Sampling rate = 44.1 KHz

Exported output file
MPEG-4 Base Media / Version 2 .mp4
Size = 1.76MB

Video
288 width x 512 height (9:16)
Rotation = 0
Profile: Baseline@L3.0
Bit rate = 778 Kbps
Frame rate mode = Constant
Frame rate = 30.000 fps

Audio
Bit rate = 64.0 Kbps
Sampling rate = 44.1 KHz

Issues

I encountered a couple of Xcode bugs as I developed the project.

Simulator hang bug

Some videos cannot be processed on the simulator, but will work fine on a real device.

This is an Xcode bug that has been documented in the following article: Using a video composition to process certain videos causes an Xcode simulator hang

A bug report has been filed with Apple. A project demonstrating the bug can be found at scottcarter/VideoCompositionSimulatorBug on GitHub.

There is no known workaround.

CMTime bug

There exists a bug on the simulator wherein a CMTime structure can get corrupted when it is passed from Swift to Objective-C. It occurs only on the iPhone 4S and iPhone 5 configurations. The bug has been documented in the article: Xcode simulator bug with Swift to Objective-C call passing CMTime structure

A bug report has been filed with Apple. A project demonstrating the bug can be found at scottcarter/CMTimeBug on GitHub.

A workaround exists and has been implemented in the method getVideoComposition() in ViewController.swift.

 
