From JSON to Core Data Fast

There are quite a number of ready-made solutions for importing data from JSON into Core Data. However, none of them fits the definition of the optimal tool we consider in this article. What we were looking for is, first of all, something fast and flexible. But tools such as MagicalRecord, the MagicalRecord fork, Mantle, and EasyMapping fail to provide high-speed performance, and that is an essential requirement in product development.

Enough talking. Let’s post some numbers!

We ran a test to see how long the data import takes with each of the tools, using both an empty database and a database that already contains the objects.

Testing environment:

  • Test example
  • Build scheme: Release. Environment: Xcode 5.1, iPhone 5s
  • Unique entities: 3000, total: 3000. Entities relationship: Person <->> Phone
  • Number of runs: 5
  • Cold - import into an empty database 
  • Hot  - import of already existing objects (i.e. update)




The results show that existing tools are far from being fast.

What can we do about it?

We can fork EasyMapping!

Why EasyMapping and not any other existing framework? In my opinion, it is the simplest and most extensible of them. The author of EasyMapping points out that the idea grew out of RestKit's mapping and Mantle's.

However, he notes: “RestKit’s problem is that it doesn't transform custom values (such as a string value to an enum)”. And, by the author's design, in EasyMapping you don't need to inherit from any class, unlike with Mantle's mapping.

That’s why I chose EasyMapping as a platform for the performance improvement (and not only that). In general the rules that describe mapping in EasyMapping are quite simple. For more info see EasyMapping or our fork.

In short, the object mapping itself is described as a separate object. The same applies to property mappings and relationship mappings (in our fork). Therefore, by combining property mappings and relationship mappings with the object mapping, you can describe the majority of mapping cases.

Back to performance troubles

The main drawback of all of the above-mentioned tools, except the MagicalRecord fork, is that they hit the database inside a loop and create NSPredicate and NSDateFormatter instances along the way. To find an existing object in the database, an NSFetchRequest is performed for each imported item. What does Apple say in the CoreData guidelines? They propose to "find" all the existing objects before the actual import and only after that "create" the missing ones, instead of running "find or create" on every iteration. This can be done by using the IN predicate:

[NSPredicate predicateWithFormat:@"primaryKey IN %@", arrayOfPrimaryKeys];

Why is it so important? Apple highlights that the complexity of "find or create" on every iteration turns into O(n^2), which leads to unacceptable import times, as you have seen above. So what have we done in our EasyMapping fork? We implemented Apple's recommendations, that's all. In order to implement the “find before import” pattern we have to somehow collect a set of primary keys from the JSON.

Collecting primary keys and caching objects should be performed by a separate class. Let's name it EMKLookupCache.

What does EMKLookupCache need in order to operate? It needs to know the mapping rules to collect the primary keys from the JSON. In our case they're represented by EMKManagedObjectMapping. It also needs an NSManagedObjectContext to work with.

@interface EMKLookupCache : NSObject
@property (nonatomic, strong, readonly) EMKManagedObjectMapping *mapping;
@property (nonatomic, strong, readonly) NSManagedObjectContext *context;
- (instancetype)initWithMapping:(EMKManagedObjectMapping *)mapping externalRepresentation:(id)externalRepresentation context:(NSManagedObjectContext *)context;
@end


externalRepresentation is JSON itself.

Also, we need to somehow retrieve the cached object. Let's add the following into the @interface:

@interface EMKLookupCache : NSObject
- (id)existingObjectForRepresentation:(id)representation mapping:(EMKManagedObjectMapping *)mapping;
@end


Implementation. Init

- (instancetype)initWithMapping:(EMKManagedObjectMapping *)mapping
		 externalRepresentation:(id)externalRepresentation
						context:(NSManagedObjectContext *)context {
	self = [super init];
	if (self) {
		_mapping = mapping;
		_context = context;

		// entity name -> set of primary keys found in the JSON
		_lookupKeysMap = [NSMutableDictionary new];
		// entity name -> (primary key -> managed object)
		_lookupObjectsMap = [NSMutableDictionary new];

		[self inspectExternalRepresentation:externalRepresentation usingMapping:mapping];
	}

	return self;
}

It is quite straightforward - you can see the allocation of the internal variables. A much more interesting part is inspection of input JSON, which is implemented here:

- (void)inspectExternalRepresentation:(id)externalRepresentation usingMapping:(EMKManagedObjectMapping *)mapping {
// 1
	id representation = [mapping mappedExternalRepresentation:externalRepresentation];
// 2
	if ([representation isKindOfClass:NSArray.class]) {
		for (id objectRepresentation in representation) {
// 3
			[self inspectObjectRepresentation:objectRepresentation usingMapping:mapping];
		}
	} else if ([representation isKindOfClass:NSDictionary.class]) {
// 3
		[self inspectObjectRepresentation:representation usingMapping:mapping];
	}
}

  1. Since the object(s) representation is not guaranteed to be the root object, we need to extract it first. This is simply done with -[NSObject valueForKeyPath:].
  2. To spare the client code from specifying the expected container of the object representation (NSArray or NSDictionary), we automate the inspection.
  3. And that is the method that does all the work.

- (void)inspectObjectRepresentation:(id)objectRepresentation usingMapping:(EMKManagedObjectMapping *)mapping {
// 1
	if (mapping.primaryKey) {
// 2
		EMKAttributeMapping *primaryKeyMapping = mapping.primaryKeyMapping;
// 3
		id primaryKeyValue = [primaryKeyMapping mappedValueFromRepresentation:objectRepresentation];
// 4
		if (primaryKeyValue) {
			// Create the per-entity set on first use (assuming it is not prepared
			// elsewhere); sending addObject: to nil would silently drop the key.
			NSMutableSet *keys = _lookupKeysMap[mapping.entityName];
			if (!keys) {
				keys = [NSMutableSet new];
				_lookupKeysMap[mapping.entityName] = keys;
			}
			[keys addObject:primaryKeyValue];
		}
	}
// 5
	for (EMKRelationshipMapping *relationshipMapping in mapping.relationshipMappings) {
		[self inspectExternalRepresentation:objectRepresentation usingMapping:relationshipMapping.objectMapping];
	}
}


  1. It may happen that a nested mapping does not have a primary key, so we need to filter out such mappings.
  2. As I’ve already mentioned, every mapping is represented by a separate class, whether it's an object, an attribute or a relationship. For a @property, EMKAttributeMapping is used.
  3. EMKAttributeMapping knows how to turn the object representation into a value (e.g. NSDictionary to NSNumber) using its internal properties: keyPath extracts the required value from the object representation, and map performs the actual transformation (e.g. NSString to NSNumber).
  4. If we found a primary key, we place it into a store divided by entity names. Why so? Our JSON potentially contains more than one entity, so we need to keep a separate set of primary keys per entity.
  5. Since every EMKMapping (superclass of EMKManagedObjectMapping) can contain relationships, let's go through them as well.

To summarize what we have just seen: by traversing the JSON we collect primary keys and store them in an NSMutableSet for each entity separately. Therefore, at the end of this method we have the full set of primary keys to look up.
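
To make the traversal easier to follow in isolation, here is a minimal Foundation-only sketch of the same idea. It is not the fork's code: the mapping is a plain NSDictionary with the hypothetical keys "entity", "primaryKey" and "relationships" standing in for EMKManagedObjectMapping, and DemoCollectKeys and DemoCollectedKeys are names made up for this illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for EMKManagedObjectMapping: a dictionary with
// "entity" (name), "primaryKey" (JSON key, optional) and
// "relationships" (JSON key -> nested mapping dictionary).
static void DemoCollectKeys(id representation, NSDictionary *mapping, NSMutableDictionary *keysByEntity) {
	// Accept both containers so the caller does not have to care.
	if ([representation isKindOfClass:[NSArray class]]) {
		for (id item in (NSArray *)representation) {
			DemoCollectKeys(item, mapping, keysByEntity);
		}
		return;
	}
	if (![representation isKindOfClass:[NSDictionary class]]) return;
	NSDictionary *object = representation;

	// Mappings without a primary key are skipped; their relationships are still visited.
	NSString *primaryKey = mapping[@"primaryKey"];
	id value = primaryKey ? object[primaryKey] : nil;
	if (value && value != [NSNull null]) {
		NSMutableSet *keys = keysByEntity[mapping[@"entity"]];
		if (!keys) {
			keys = [NSMutableSet set];
			keysByEntity[mapping[@"entity"]] = keys;
		}
		[keys addObject:value];
	}

	// Recurse into nested objects described by relationship mappings.
	NSDictionary *relationships = mapping[@"relationships"];
	for (NSString *jsonKey in relationships) {
		id nested = object[jsonKey];
		if (nested) DemoCollectKeys(nested, relationships[jsonKey], keysByEntity);
	}
}

// Usage: two people sharing one phone yield two Person keys and two Phone keys.
static NSDictionary *DemoCollectedKeys(void) {
	NSDictionary *phoneMapping = @{@"entity": @"Phone", @"primaryKey": @"id"};
	NSDictionary *personMapping = @{@"entity": @"Person",
									@"primaryKey": @"id",
									@"relationships": @{@"phones": phoneMapping}};
	NSArray *json = @[
		@{@"id": @1, @"phones": @[@{@"id": @10}, @{@"id": @11}]},
		@{@"id": @2, @"phones": @[@{@"id": @10}]},
	];
	NSMutableDictionary *keysByEntity = [NSMutableDictionary dictionary];
	DemoCollectKeys(json, personMapping, keysByEntity);
	return keysByEntity;
}
```

Using a set per entity means a primary key that appears many times in the JSON (the shared phone above) ends up in the IN predicate only once.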

How does retrieval of the cached objects work?

- (id)existingObjectForRepresentation:(id)representation mapping:(EMKManagedObjectMapping *)mapping {
// 1
	NSDictionary *entityObjectsMap = [self cachedObjectsForMapping:mapping];
// 2
	id primaryKeyValue = [mapping.primaryKeyMapping mappedValueFromRepresentation:representation];
	if (primaryKeyValue == nil || primaryKeyValue == NSNull.null) return nil;
// 3
	return entityObjectsMap[primaryKeyValue];
}


  1. cachedObjectsForMapping returns a dictionary of the cached objects, where the key is a primaryKey.
  2. EMKAttributeMapping knows how to retrieve a requested value from the object representation. Therefore, retrieving a primary key from the object representation is very simple.
  3. Final statement: return the cached object for the specified primary key.

Let's go deeper and reveal the implementation of cachedObjectsForMapping:

- (NSMutableDictionary *)cachedObjectsForMapping:(EMKManagedObjectMapping *)mapping {
// 1
	NSMutableDictionary *entityObjectsMap = _lookupObjectsMap[mapping.entityName];
	if (!entityObjectsMap) {
// 2
		entityObjectsMap = [self fetchExistingObjectsForMapping:mapping];
		_lookupObjectsMap[mapping.entityName] = entityObjectsMap;
	}

	return entityObjectsMap;
}


  1. By passing the entity name to _lookupObjectsMap we retrieve the objects of that entity as a dictionary keyed by primary key.
  2. This is the most interesting place: on the first request for an entity, its existing objects are fetched from the database and stored in the cache.

Let's reveal the implementation below.

- (NSMutableDictionary *)fetchExistingObjectsForMapping:(EMKManagedObjectMapping *)mapping {
	NSSet *lookupValues = _lookupKeysMap[mapping.entityName];
// 1
	if (lookupValues.count == 0) return [NSMutableDictionary dictionary];

// 2
	NSFetchRequest *fetchRequest = [NSFetchRequest fetchRequestWithEntityName:mapping.entityName];
	NSPredicate *predicate = [NSPredicate predicateWithFormat:@"%K IN %@", mapping.primaryKey, lookupValues];
	[fetchRequest setPredicate:predicate];
	[fetchRequest setFetchLimit:lookupValues.count];

	NSMutableDictionary *output = [NSMutableDictionary new];
// 3
	NSArray *existingObjects = [_context executeFetchRequest:fetchRequest error:NULL];
	for (NSManagedObject *object in existingObjects) {
		output[[object valueForKey:mapping.primaryKey]] = object;
	}

	return output;
}


  1. If there are no primary keys collected - return immediately.
  2. As Apple says: “IN predicate fits best for "find" pattern.” lookupValues contains the primary keys that were collected during JSON inspection.
  3. Again no rocket science here. We're just turning an array of fetched objects into a dictionary with primary keys set as keys.

What's next? Objects that are missing from our cache have to be created, and these new objects have to be added to the cache as well. Otherwise a newly created object could be duplicated many times within a single import. Let's implement it:

- (void)addExistingObject:(id)object usingMapping:(EMKManagedObjectMapping *)mapping {
// 1
	id primaryKeyValue = [object valueForKey:mapping.primaryKey];
	NSAssert(primaryKeyValue, @"No value for key (%@) on object (%@) found", mapping.primaryKey, object);
// 2
	NSMutableDictionary *entityObjectsMap = [self cachedObjectsForMapping:mapping];
	[entityObjectsMap setObject:object forKey:primaryKeyValue];
}


  1. Nothing special here: read the primary key value from the object and assert that it exists.
  2. Simply add the object to the dictionary of objects prefetched from the database.

At this point we’re done with the cache implementation.

To recap, let’s see what the import looks like:

  1. The client passes the mapping scheme and the JSON to the deserializer.
  2. The deserializer passes them to EMKLookupCache.
  3. The lookup cache collects all the primary keys from the JSON using the mapping scheme and fetches the existing objects.
  4. For each entry in the JSON, the deserializer asks the lookup cache for the existing object. If none is found, it creates one and adds it to the cache.
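
The whole flow can be tried end to end with a throwaway in-memory store. The sketch below is an illustration under assumptions, not the fork's deserializer: the Person entity with personID and name attributes, the JSON keys "id" and "name", and the DemoMakeContext and DemoImportPeople functions are all hypothetical, and every JSON entry is assumed to carry an "id".

```objc
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Throwaway in-memory stack with one hypothetical entity: Person(personID, name).
static NSManagedObjectContext *DemoMakeContext(void) {
	NSAttributeDescription *personID = [NSAttributeDescription new];
	personID.name = @"personID";
	personID.attributeType = NSInteger64AttributeType;

	NSAttributeDescription *name = [NSAttributeDescription new];
	name.name = @"name";
	name.attributeType = NSStringAttributeType;

	NSEntityDescription *person = [NSEntityDescription new];
	person.name = @"Person";
	person.managedObjectClassName = @"NSManagedObject";
	person.properties = @[personID, name];

	NSManagedObjectModel *model = [NSManagedObjectModel new];
	model.entities = @[person];

	NSPersistentStoreCoordinator *coordinator =
		[[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];
	[coordinator addPersistentStoreWithType:NSInMemoryStoreType
							  configuration:nil URL:nil options:nil error:NULL];

	NSManagedObjectContext *context =
		[[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
	context.persistentStoreCoordinator = coordinator;
	return context;
}

// Imports an array of {"id": ..., "name": ...} dictionaries using a single fetch.
// Returns the number of objects that had to be created.
static NSUInteger DemoImportPeople(NSArray *json, NSManagedObjectContext *context) {
	// Collect the primary keys, then "find" all existing objects at once.
	NSSet *keys = [NSSet setWithArray:[json valueForKey:@"id"]];
	NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Person"];
	request.predicate = [NSPredicate predicateWithFormat:@"%K IN %@", @"personID", keys];

	NSMutableDictionary *cache = [NSMutableDictionary dictionary];
	for (NSManagedObject *object in [context executeFetchRequest:request error:NULL]) {
		cache[[object valueForKey:@"personID"]] = object;
	}

	// "Or create": only cache misses are inserted, and they are cached right away
	// so that a key repeated in the same JSON does not produce a duplicate.
	NSUInteger created = 0;
	for (NSDictionary *entry in json) {
		NSManagedObject *person = cache[entry[@"id"]];
		if (!person) {
			person = [NSEntityDescription insertNewObjectForEntityForName:@"Person"
												   inManagedObjectContext:context];
			[person setValue:entry[@"id"] forKey:@"personID"];
			cache[entry[@"id"]] = person;
			created++;
		}
		[person setValue:entry[@"name"] forKey:@"name"];
	}
	[context save:NULL];
	return created;
}
```

However many entries the JSON has, the store is queried once per entity instead of once per entry, which is the whole point of the lookup cache.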

Almost the end of the story

I hope you noticed an interesting trend in the update (i.e. hot) test for MagicalRecord, Mantle and EasyMapping, one that the MagicalRecord fork does not show: update and save takes even more time than create and save. Why so? During my research on MagicalRecord performance I found an interesting comment posted by the author of the MagicalRecord 2.2 fork:

“...There were also a few quick and easy wins, one example being not modifying attributes and relationships that didn't need to be changed. On updates this can remove a good number of SQLite statements as by simply touching a property forces core data to update the version number of the object. You end up seeing a bunch of "UPDATE ZENTITY SET Z_OPT = ?, Z_ENT = ?;" statements otherwise.”

In short, if we don't assign an attribute a value equal to the one it already has, we can omit a bunch of SQL UPDATE queries.

@implementation NSObject (EMKKVO)

- (void)emk_setValueIfDifferent:(id)value forKey:(NSString *)key {
	id _value = [self valueForKey:key];
	// Same pointer (or both nil): nothing to do.
	if (_value == value) return;

	if (_value != nil && value == nil) {
		[self setValue:nil forKey:key];
	} else if (![value isEqual:_value]) {
		[self setValue:value forKey:key];
	}
}

@end


Please see the original implementation.
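
To see the effect without Core Data, here is a small self-contained sketch that counts how many times a setter actually runs. DemoPerson, its writeCount property and the demo_ prefixed category method are hypothetical names for this illustration; the category body follows the same logic as the method above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical model object that counts real writes to "name".
@interface DemoPerson : NSObject
@property (nonatomic, copy) NSString *name;
@property (nonatomic, assign) NSUInteger writeCount;
@end

@implementation DemoPerson
- (void)setName:(NSString *)name {
	_name = [name copy];
	_writeCount++;
}
@end

@interface NSObject (DemoSetIfDifferent)
- (void)demo_setValueIfDifferent:(id)value forKey:(NSString *)key;
@end

@implementation NSObject (DemoSetIfDifferent)
- (void)demo_setValueIfDifferent:(id)value forKey:(NSString *)key {
	id current = [self valueForKey:key];
	if (current == value) return;                 // same pointer or both nil
	if (value == nil || ![value isEqual:current]) {
		[self setValue:value forKey:key];         // a real change
	}
}
@end

// Four assignments, but only three of them change the value.
static NSUInteger DemoWriteCount(void) {
	DemoPerson *person = [DemoPerson new];
	[person demo_setValueIfDifferent:@"Ann" forKey:@"name"];   // nil -> "Ann": write
	NSString *sameText = [@"An" stringByAppendingString:@"n"]; // equal content, built at run time
	[person demo_setValueIfDifferent:sameText forKey:@"name"]; // equal: skipped
	[person demo_setValueIfDifferent:@"Bob" forKey:@"name"];   // different: write
	[person demo_setValueIfDifferent:nil forKey:@"name"];      // "Bob" -> nil: write
	return person.writeCount;
}
```

With a managed object in place of DemoPerson, each skipped write is a property that Core Data never gets to see as touched.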


The end of the story

Don't forget to RTFM and you will save billions of microseconds for CPU.

Admittedly, Mantle wasn't designed to work with CoreData directly ("Mantle doesn’t automatically persist your objects for you… If you need something more powerful, or want to avoid keeping your whole model in memory at once, CoreData may be a better choice"), but it lets you map JSON to objects gracefully and without any headache.

The way Mantle utilizes NSValueTransformer, implements error handling and decomposes responsibilities between objects should become standard in modern Obj-C development. MagicalRecord was one of the first tools to simplify working with CoreData. I also love its numerous categories, which let you fetch objects or set up the whole CoreData stack in one line.

Why is the efficient import missing in the implementation? I really don't know.

EasyMapping was something new to me. It is light and powerful. It is worth using.

What else?

Some of you may ask: “Hey, what about NSPredicate and NSDateFormatter caching?” Yes, it does make sense, but not in the EasyMapping case. Here the mapping scheme is static, so the same NSDateFormatter / NSPredicate instances are used throughout the entire life cycle of the mapping object.
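
As a rough illustration of why a static mapping removes the need for a separate cache, the sketch below builds one date formatter when a transformer is created and reuses it for every value. DemoValueTransformer and DemoMakeDateTransformer are hypothetical names for this example, not EasyMapping API.

```objc
#import <Foundation/Foundation.h>

// Hypothetical value transformer: takes a JSON value, returns a model value.
typedef id (^DemoValueTransformer)(id jsonValue);

// The formatter is created once, when the mapping is described, and captured
// by the block; every mapped value reuses the same instance.
static DemoValueTransformer DemoMakeDateTransformer(NSString *format) {
	NSDateFormatter *formatter = [NSDateFormatter new];
	formatter.locale = [[NSLocale alloc] initWithLocaleIdentifier:@"en_US_POSIX"];
	formatter.timeZone = [NSTimeZone timeZoneForSecondsFromGMT:0];
	formatter.dateFormat = format;

	return ^id(id jsonValue) {
		if (![jsonValue isKindOfClass:[NSString class]]) return nil;
		return [formatter dateFromString:jsonValue];
	};
}
```

A transformer made with DemoMakeDateTransformer(@"yyyy-MM-dd") can then be applied to every entry of the JSON without allocating another formatter.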
