Correct manipulation of generated HTML in TYPO3

It shouldn't be necessary... but sometimes it is. For some reason, your project requires you to manipulate the final HTML content generated by TYPO3, also known as post-processing. This article explains how to do this correctly in modern TYPO3 versions.

About middlewares

Since the introduction of PSR-15 middlewares in TYPO3 [1], it is really easy to mangle the response generated by the Core in all its facets. Formerly, there was the infamous hook in TSFE to manipulate the HTML content just before it was sent to the client browser. This hook is now obsolete, as a neat middleware achieves the same task in a much cleaner way, not to mention the vastly enhanced possibilities for manipulation beyond the actual content.

Implementing a middleware that does some HTML processing is pretty straightforward. What can be a bit trickier is registering the middleware correctly in the system. The middleware system uses the dependency ordering service (see our dedicated article about this handy thing) to determine the order in which middlewares are executed. It is very important to put your middleware in the right position in this order, otherwise "strange" things might happen.

Example

Let's dive right into an example and see what's going on. In this example we assume you have a sitesetup extension with a proper composer.json file including the PSR-4 path definitions.

Middleware Class

my_ext/Classes/Middleware/ChangeTheName.php:

<?php declare(strict_types=1);

namespace Reelworx\MyExt\Middleware;

use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;
use Psr\Http\Server\MiddlewareInterface;
use Psr\Http\Server\RequestHandlerInterface;
use TYPO3\CMS\Core\Http\NullResponse;
use TYPO3\CMS\Core\Http\Stream;

class ChangeTheName implements MiddlewareInterface
{
    public function process(ServerRequestInterface $request, RequestHandlerInterface $handler): ResponseInterface
    {
        // let it generate a response
        $response = $handler->handle($request);
        if ($response instanceof NullResponse) {
            return $response;
        }

        // extract the content
        $body = $response->getBody();
        $body->rewind();
        $content = $body->getContents();

        // the actual replacement
        $content = str_replace('Christmas', 'Easter', $content);

        // push new content back into the response
        $body = new Stream('php://temp', 'rw');
        $body->write($content);
        return $response->withBody($body);
    }
}

This class contains all the magic to replace all occurrences of the word "Christmas" with "Easter", no matter where it appears in the output. It consists of 4 distinct parts:

  1. Pass the processing on to the subsequent handlers (middlewares) in order to retrieve the actual response of TYPO3.
  2. Extract the content from the response object.
  3. Do your thing and mangle the content.
  4. Craft a new Stream object with the new content and use it as body for the response.

Middleware registration

The challenging part is now to find the right place for your middleware. The Configuration module in the backend shows you a list of all registered middlewares for frontend requests, and it is up to you to find the sweet spot. It is a tough thing to do, since you need to know in quite some detail what each existing middleware does, whether it matters for yours and, if it does, whether it needs to go before or after yours.

To help you out with this question, the following registration shows the right way for this example of content manipulation.

my_ext/Configuration/RequestMiddlewares.php:

<?php

return [
    'frontend' => [
        'rx/name' => [
            'target' => \Reelworx\MyExt\Middleware\ChangeTheName::class,
            'after' => [
                'typo3/cms-frontend/content-length-headers'
            ],
        ],
    ]
];

We declare our middleware to be loaded after the "Content Length Headers" middleware. This is extremely important, as you can read below. Our middleware changes the HTML output, therefore the HTTP Content-Length header needs to reflect the changed content. To achieve this, we need to mangle the content before the header is calculated by the Core's middleware. It might be a bit of a brain twist, but we need to register our middleware after the content-length middleware in order for it to be actually executed before it. The reason is that both middlewares do their work as post-processing of the response object, so the execution order is implicitly reversed.
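The reversed order is easier to see in a small simulation. The following is only a sketch: it uses plain closures and arrays instead of the real PSR-15 interfaces, and the names `$contentLength`, `$changeTheName` and `$core` are hypothetical stand-ins, not TYPO3 code.

```php
<?php declare(strict_types=1);

// Stand-in for the Core's content-length middleware (hypothetical, simplified).
$contentLength = static function (callable $next): callable {
    return static function (string $request) use ($next): array {
        $response = $next($request); // first let the inner layers do their work
        $response['headers']['Content-Length'] = strlen($response['body']);
        return $response;
    };
};

// Stand-in for our ChangeTheName middleware.
$changeTheName = static function (callable $next): callable {
    return static function (string $request) use ($next): array {
        $response = $next($request);
        $response['body'] = str_replace('Christmas', 'Easter', $response['body']);
        return $response;
    };
};

// Innermost handler, standing in for the actual page rendering.
$core = static function (string $request): array {
    return ['headers' => [], 'body' => '<p>Merry Christmas</p>'];
};

// Resolved order, outermost first: registering "after" puts ours further inside.
$stack = [$contentLength, $changeTheName];

// Build the onion from the inside out.
$handler = $core;
foreach (array_reverse($stack) as $middleware) {
    $handler = $middleware($handler);
}

$response = $handler('/');
echo $response['body'], ' / ', $response['headers']['Content-Length'], PHP_EOL;
```

On the way in, the request passes the content-length layer first and ours second. On the way out, the response passes ours first, so the replacement has already happened when the length is measured.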

Pitfalls

Yes, I managed to get this middleware registration wrong a couple of times. The consequences range from "immediate full system crash" to "barely noticed it".

Full system crash

If you manage to get the before and after parts of the registration wrong somehow, you might end up creating a cycle in the ordering definition. The Core will "happily" tell you so after the next cache flush, as soon as the next request hits the system. An exception with a pretty long detail output about your misconfiguration will be part of this full stop. You will have a pretty hard time figuring out why your relation definition causes a cycle, given how many middlewares are in place. But at least you get a clear message that you screwed it up.
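As an illustration, a registration along the following lines would be expected to produce such a cycle. It is a deliberately broken sketch and rests on an assumption: that in the default frontend stack `typo3/cms-frontend/timetracker` is ordered (directly or transitively) before `typo3/cms-frontend/content-length-headers`. Check the Configuration module for the actual order in your installation.

```php
<?php

// Deliberately broken example: do NOT use this registration.
return [
    'frontend' => [
        'rx/name' => [
            'target' => \Reelworx\MyExt\Middleware\ChangeTheName::class,
            // Asks to run before a middleware at the start of the stack
            // (assumption: timetracker comes before content-length-headers) ...
            'before' => [
                'typo3/cms-frontend/timetracker',
            ],
            // ... and at the same time after one that comes much later.
            'after' => [
                'typo3/cms-frontend/content-length-headers',
            ],
        ],
    ],
];
```

Both constraints cannot be satisfied at once, so the ordering cannot be resolved.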

Barely noticed it

Things run fine most of the time. Browsers do not complain. Automated tests run as well. No problems to expect, right?

Until a new test is added (e.g. end-to-end with cypress.io) which fails with rather weird error messages about not being able to reach the page. Trying the page in the browser works flawlessly, of course. The inner error message says something about not being able to parse the HTTP response, while the output printed with the error looks okay.

Turns out that a wrong registration of a content-mangling middleware caused it to be executed after the content-length header had been set. This results in an invalid HTTP response, which is obviously tolerated by browsers, but pretty much fatal for any other HTTP implementation.
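To make the mismatch tangible, here is a minimal sketch in plain PHP, without any TYPO3 classes. The sample markup in `$original` is a made-up page body; the replacement is the one from the middleware above.

```php
<?php declare(strict_types=1);

// Hypothetical page body, standing in for the HTML rendered by TYPO3.
$original = '<h1>Merry Christmas</h1>';

// The header value, as it would be computed from the unmodified body.
$declaredLength = strlen($original);

// The replacement is applied too late, after the header has been set.
$mangled = str_replace('Christmas', 'Easter', $original);
$actualLength = strlen($mangled);

printf(
    "Content-Length header: %d, bytes actually sent: %d\n",
    $declaredLength,
    $actualLength
);
```

"Easter" is three bytes shorter than "Christmas", so the header announces three bytes more per occurrence than the body actually delivers. A strict HTTP client has good reason to treat such a response as broken.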

Fixing the test is then "just" a matter of getting your middleware registration correct. And things go green.

[1] https://docs.typo3.org/m/typo3/reference-coreapi/10.4/en-us/ApiOverview/RequestHandling/Index.html

This information applies to TYPO3 10 LTS.