Extracting Data Objects

First published on Tuesday, 7 February 2017

This blog post was first published on the Qafoo blog and is duplicated here since I wrote it or participated in writing it.

Warning: This blog post is more than 7 years old – read and use with care.

Extracting data objects from your code will make it easier to read and write, easier to test and more forward compatible. This post shows you the two most common cases where introducing a data object makes sense and how to do it.

Too Many Parameters

Every project has them: method signatures that keep growing because someone just adds another parameter. Query methods are a very typical example:

    public function findProducts($phrase, $categories = array(), $minPrice = 0, $maxPrice = null, $productTypeFilters = array(), $limit = 10, $offset = 0)
    {
        // ...
    }

There are several issues with such method signatures: it is really hard to remember which parameter is at which position, additional information requires you to add even more parameters, and introducing more mandatory data even forces you to change the parameter order, which will most probably be a large amount of work.

Inspecting the parameters closely you can find a common pattern for most of them: 5 of 7 parameters are criteria for product search. This already reveals the name for the data object to choose:

    class ProductCriteria
    {
        public $phrase;
        public $categories = array();
        public $minPrice = 0;
        public $maxPrice;
        public $productTypeFilters = array();

        public function __construct($phrase)
        {
            $this->phrase = $phrase;
        }
    }

Using this data object strips down the method signature to three parameters:

    public function findProducts(ProductCriteria $criteria, $limit = 10, $offset = 0)
    {
        // ...
    }

It is much more readable now, and it is much easier to introduce additional criteria. With some effort you can even change the structure used inside the criteria fields without affecting the calling code.
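To illustrate how calling code reads with the data object, here is a minimal sketch. The `ProductRepository` class, its in-memory product list and the `name`/`price` array keys are hypothetical stand-ins, and only the phrase and price criteria are evaluated to keep the example short.

```php
<?php

// Data object from the post, repeated so the sketch is self-contained.
class ProductCriteria
{
    public $phrase;
    public $categories = array();
    public $minPrice = 0;
    public $maxPrice;
    public $productTypeFilters = array();

    public function __construct($phrase)
    {
        $this->phrase = $phrase;
    }
}

// Hypothetical owner of findProducts(), backed by a plain array.
class ProductRepository
{
    private $products;

    public function __construct(array $products)
    {
        $this->products = $products;
    }

    public function findProducts(ProductCriteria $criteria, $limit = 10, $offset = 0)
    {
        // Assumption: only phrase and price range are checked in this sketch.
        $matches = array_filter($this->products, function (array $product) use ($criteria) {
            return stripos($product['name'], $criteria->phrase) !== false
                && $product['price'] >= $criteria->minPrice
                && ($criteria->maxPrice === null || $product['price'] <= $criteria->maxPrice);
        });

        return array_slice(array_values($matches), $offset, $limit);
    }
}

$repository = new ProductRepository(array(
    array('name' => 'Red Shoes', 'price' => 40),
    array('name' => 'Blue Shoes', 'price' => 120),
    array('name' => 'Red Hat', 'price' => 15),
));

// Criteria are set by name, so their order no longer matters.
$criteria = new ProductCriteria('shoes');
$criteria->maxPrice = 100;

$found = $repository->findProducts($criteria);
```

Adding a new criterion later only means adding a property to `ProductCriteria`; existing callers stay untouched.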

It might make sense to use a base class for your data objects, as described in an earlier post, since PHP does not have native support for data objects and it can provide you with additional convenience.
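One convenience such a base class could provide is failing loudly on mistyped property names. The following is only a sketch of that idea, not the base class from the earlier post; the `DataObject` name and the exception type are assumptions.

```php
<?php

// Hypothetical base class: PHP calls __get()/__set() only for properties
// that are not declared (or not accessible), so declared public properties
// keep working while unknown names raise an exception.
abstract class DataObject
{
    public function __get($name)
    {
        throw new OutOfRangeException('Unknown property $' . $name . ' in ' . get_class($this));
    }

    public function __set($name, $value)
    {
        throw new OutOfRangeException('Unknown property $' . $name . ' in ' . get_class($this));
    }
}

class ProductCriteria extends DataObject
{
    public $phrase;
    public $minPrice = 0;
    public $maxPrice;

    public function __construct($phrase)
    {
        $this->phrase = $phrase;
    }
}

$criteria = new ProductCriteria('shoes');
$criteria->minPrice = 10; // declared property: plain assignment

$error = null;
try {
    $criteria->minimumPrice = 10; // typo: not a declared property
} catch (OutOfRangeException $e) {
    $error = $e->getMessage();
}
```

Without such a guard, the mistyped assignment would silently create a new property on the object instead of setting the intended one.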

Associative Arrays

Arrays in PHP are a powerful data type. Whenever there is data to be structured it is easy to just create a (potentially deeply nested) mixture of struct and list out of thin air. That makes them a really good tool for prototyping, for example:

    public function getDiscounts(array $checkout)
    {
        // ....
    }

But once the prototyping phase is over, they soon become a real pain: there is no defined way to document array structures, so the IDE cannot tell you which fields exist, what their purpose is, and what types they expect. The only way to know is to read the code that creates the array structure and the code that uses it. Due to the lack of auto-completion on field names, there is a high risk of typos. And because it is so easy to add new fields, people will eventually add whatever they need in a single place, making your array more and more god-like.

It is therefore a good idea to replace any associative array structure with a data object once the structure has stabilized a bit. For example:

    class Checkout
    {
        /**
         * @var CheckoutItem[]
         */
        public $items;

        /**
         * @var Address
         */
        public $shippingAddress;

        // ...
    }

With this approach you actually solidify the structure you prototyped as an array and create sensible documentation and auto-completion support for it. In addition you raise the barrier for adding arbitrary new fields by adding one more thinking step.

Smooth Migration

In most cases a migration towards a data object cannot be accomplished within a few minutes. A quick change only works if the method whose signature you want to change is used infrequently. If that is the case: lucky you, go ahead and perform the changes. Otherwise you should perform a smooth migration over time. You can most probably apply the following steps mechanically.

Create a New Method

Because you cannot simply change the original method signature, you need a new method right beside the original one. For example:

    /**
     * @deprecated Use calculateDiscounts() instead!
     */
    public function getDiscounts(array $checkout)
    {
        // ....
    }

    public function calculateDiscounts(Checkout $checkout)
    {
        // ....
    }

The @deprecated annotation added to the original method is quite handy, because IDEs can display warnings to developers still using the old method.

Of course, having these two methods lurking around right beside each other is not nice. But remember that this is only a temporary state until you have finished the refactoring entirely.

Dispatch Old Method To New

After adding the new method you probably have code duplication. To remove that, remove the body of the old method and call the new one instead, migrating the incoming array to the new data object:

    /**
     * @deprecated Use calculateDiscounts() instead!
     */
    public function getDiscounts(array $checkout)
    {
        return $this->calculateDiscounts(Checkout::fromLegacyArray($checkout));
    }

    public function calculateDiscounts(Checkout $checkout)
    {
        // ....
    }

To keep the conversion from the original array to the new object in a single place, I added a static factory method (one of the few cases where static is OK) to the Checkout class.
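The post does not show the factory method itself, so here is a rough sketch of what it might look like. The legacy array keys (`items`, `sku`, `quantity`, `shippingAddress`, `street`, `city`) and the fields of `CheckoutItem` and `Address` are invented for illustration, and the shipping address is assumed to be a single `Address`.

```php
<?php

// Hypothetical item and address objects; the field names are assumptions.
class CheckoutItem
{
    public $sku;
    public $quantity = 1;
}

class Address
{
    public $street;
    public $city;
}

class Checkout
{
    /** @var CheckoutItem[] */
    public $items = array();

    /** @var Address|null */
    public $shippingAddress;

    // The single place that knows about the legacy array layout.
    public static function fromLegacyArray(array $checkout)
    {
        $object = new self();

        $legacyItems = isset($checkout['items']) ? $checkout['items'] : array();
        foreach ($legacyItems as $itemData) {
            $item = new CheckoutItem();
            $item->sku = $itemData['sku'];
            $item->quantity = isset($itemData['quantity']) ? $itemData['quantity'] : 1;
            $object->items[] = $item;
        }

        if (isset($checkout['shippingAddress'])) {
            $address = new Address();
            $address->street = $checkout['shippingAddress']['street'];
            $address->city = $checkout['shippingAddress']['city'];
            $object->shippingAddress = $address;
        }

        return $object;
    }
}

$checkout = Checkout::fromLegacyArray(array(
    'items' => array(
        array('sku' => 'A-1', 'quantity' => 2),
        array('sku' => 'B-7'),
    ),
    'shippingAddress' => array('street' => '1 Main St', 'city' => 'Springfield'),
));
```

Keeping all knowledge of the legacy keys inside this one method means the array structure can later be retired by deleting a single method.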

Change Use Case

Now it's time to change the use-case you are working on - the place which motivated you to actually start the refactoring:

    // ... calling code ...
    $discounts = $whereverTheMethodIs->calculateDiscounts(
        Checkout::fromLegacyArray($checkoutArray)
    );

Congrats, you have finished the first step towards eliminating the deprecated method from your project. :)

Iterate

You should make it a rule in your project to perform this refactoring step whenever you encounter a use of the old method. After some weeks, search your code for the (hopefully few) remaining method calls and change them. Once you have reached that state, you can safely remove the deprecated method.
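Besides searching the code, one optional way to spot remaining callers is to let the old method raise a runtime deprecation notice. This is not part of the steps described in the post, just a possible complement; the `DiscountCalculator` class name, the stripped-down `Checkout` and the placeholder method body are assumptions.

```php
<?php

// Stripped-down data object so the sketch is self-contained.
class Checkout
{
    public $items = array();

    public static function fromLegacyArray(array $checkout)
    {
        $object = new self();
        $object->items = isset($checkout['items']) ? $checkout['items'] : array();

        return $object;
    }
}

// Hypothetical class holding both the old and the new method.
class DiscountCalculator
{
    /**
     * @deprecated Use calculateDiscounts() instead!
     */
    public function getDiscounts(array $checkout)
    {
        // Assumption: a runtime notice in addition to the annotation.
        trigger_error(
            __METHOD__ . '() is deprecated, use calculateDiscounts() instead',
            E_USER_DEPRECATED
        );

        return $this->calculateDiscounts(Checkout::fromLegacyArray($checkout));
    }

    public function calculateDiscounts(Checkout $checkout)
    {
        return array(); // placeholder for the real calculation
    }
}

// Collect deprecation notices, for example to write them to a log.
$deprecations = array();
set_error_handler(function ($severity, $message) use (&$deprecations) {
    $deprecations[] = $message;

    return true;
}, E_USER_DEPRECATED);

$calculator = new DiscountCalculator();
$discounts = $calculator->getDiscounts(array('items' => array()));

restore_error_handler();
```

Once such notices stop appearing in your logs over a reasonable period, that is a hint (not a proof) that the old method is no longer called.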

While you are a big step further now, the end is still not reached. Look through your code and find all usages of the Checkout::fromLegacyArray() method. These are the places where the original array structure is still used. You can now start replacing these cases in a similar way as explained here.
