Some of PHP's problems
Last Update: 2023-11-12
First off: What even is PHP?
PHP is a widely used programming language on the Web. WordPress and its plugins are built on PHP. Another cool thing about PHP is that you can just throw a PHP script onto a typical webserver and it will just work.
The language's type system is a mix of ECMAScript, TypeScript and Java. PHP 8.2 even deprecated dynamic properties (though you can still opt in on a class-by-class basis). It has a package manager called Composer, and the package registry is Packagist.
Problems
PHP tutorials are misleading
PHP was created to make HTML more interactive. That's why every PHP file has to start with <?php to indicate that you're using PHP code, because you could just rename a .html file to a .php file.
PHP tutorials like to spin that as typing normal HTML with special tags.
So we have an HTML file like this:
<html>
<head>
<title>PHP Test</title>
</head>
<body>
<p>Hello, World</p>
</body>
</html>
and transform it to this:
<html>
<head>
<title>PHP Test</title>
</head>
<body>
<?php echo '<p>Hello, ' . $_REQUEST['name'] . '</p>'; ?>
</body>
</html>
That's cool. Now the webpage says Hello, Fionn, if the GET or POST parameter name is set to Fionn.
The issue is that beginners might think: Oh yeah, that's just like a template. But PHP doesn't escape special HTML characters, so this is open to an XSS injection.
You have to escape the input parameter here with htmlentities, like this:
<html>
<head>
<title>PHP Test</title>
</head>
<body>
<?php echo '<p>Hello, ' . htmlentities($_REQUEST['name']) . '</p>'; ?>
</body>
</html>
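As a rough sketch of how to make escaping hard to forget, you could wrap it in a tiny helper. The name e() is made up for this example, and it uses htmlspecialchars (a close sibling of htmlentities that only escapes the characters HTML actually cares about) with explicit flags. The input is simulated here; in a real script it would come from $_REQUEST['name'].

```php
<?php
// Hypothetical helper (not part of PHP): escape a value for HTML output.
function e(?string $value): string
{
    // ENT_QUOTES also escapes single quotes; ENT_SUBSTITUTE replaces invalid
    // UTF-8 sequences instead of returning an empty string.
    return htmlspecialchars($value ?? '', ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Simulated request input standing in for $_REQUEST['name'].
$name = '<script>alert(1)</script>';

// Prints: <p>Hello, &lt;script&gt;alert(1)&lt;/script&gt;</p>
echo '<p>Hello, ' . e($name) . '</p>', PHP_EOL;
```

A templating language does the same thing by default, which is exactly why it's the safer choice.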
As a video from Programmers are also human put it:
The web1 wouldn't have existed without PHP and the web2 wouldn't be as cautious of security issues.
This is mainly a problem with existing beginner tutorials, but bleeds into actual production code.
Also steer clear of the HTML with special tags stuff that PHP offers. Use an actual templating language.
PHP's weird type coercion
While ECMAScript developers might complain that 0.1 + 0.2 == 0.3 results in false, that is perfectly normal, because you're using floating-point numbers. In PHP md5('240610708') == md5('QNKCDZO') results in true. And not because of hash collisions.
This is because md5('240610708') is 0e462097431906509019562988736854 and md5('QNKCDZO') is 0e830400451993494058024219903391. Notice that there are just digits behind the 0e? PHP notices too. And converts it. To a floating-point number. Both floating-point numbers are 0. This results in 0 == 0, which is true. So PHP just implicitly converted a string to a float, even though both sides are strings. Debug this in production. I dare you.
You can disable some of PHP's type coercion with declare(strict_types=1);. Though your coworkers might not like you using this, and it won't fix the md5('240610708') == md5('QNKCDZO') issue, because it only applies to typed function arguments and return values, not to comparison operators. So use === (or !==) instead.
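Here's a small sketch of the three ways to compare those two hashes. hash_equals is a built-in that compares strings in constant time, which is what you'd want for anything security-related anyway.

```php
<?php
$a = md5('240610708'); // "0e462097431906509019562988736854"
$b = md5('QNKCDZO');   // "0e830400451993494058024219903391"

// Loose comparison: both are numeric strings, so they're compared as numbers.
var_dump($a == $b);            // bool(true)

// Strict comparison: same type, so the strings are compared as strings.
var_dump($a === $b);           // bool(false)

// Constant-time string comparison, no numeric conversion involved.
var_dump(hash_equals($a, $b)); // bool(false)
```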
There's another example of this that I found recently:
PHP_INT_MAX + 1
On a 64-bit system that is 9.2233720368548E+18. We silently went from an int to a float. Just great.
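If you'd rather get an error than a float, one option is to check the result yourself. This is only a sketch; checked_add is a name I made up, not something PHP ships.

```php
<?php
// Hypothetical helper: add two ints and refuse to silently become a float.
function checked_add(int $a, int $b): int
{
    $sum = $a + $b;
    // On overflow PHP returns a float instead of an int.
    if (!is_int($sum)) {
        throw new OverflowException("$a + $b does not fit into an int");
    }
    return $sum;
}

var_dump(is_float(PHP_INT_MAX + 1)); // bool(true)
var_dump(checked_add(1, 2));         // int(3)

try {
    checked_add(PHP_INT_MAX, 1);
} catch (OverflowException $e) {
    echo 'Overflow caught', PHP_EOL;
}
```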
Functions as values
Let's square all values in an array.
Example with arrow functions:
array_map(static fn($x) => $x * $x, [1, 2, 3])
Example with anonymous functions:
array_map(function(int $x): int {
    return $x * $x;
}, [1, 2, 3]);
Or pass the name of a named function as a string:
function square(int $x): int {
    return $x * $x;
}
array_map('square', [1, 2, 3]);
So yeah, I would prefer static fn($x) => square($x) over passing the function name as a string. Your LSP server and tooling will thank you.
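For completeness, a sketch comparing the string callable, the arrow-function wrapper and the first-class callable syntax. The last one needs PHP 8.1 or newer, which is an assumption about your runtime.

```php
<?php
function square(int $x): int
{
    return $x * $x;
}

// String callable: works, but to your tooling it's just a string.
$a = array_map('square', [1, 2, 3]);

// Arrow-function wrapper: the call to square() is visible to static analysis.
$b = array_map(static fn(int $x): int => square($x), [1, 2, 3]);

// First-class callable syntax (PHP 8.1+): builds a Closure from the function.
$c = array_map(square(...), [1, 2, 3]);

var_dump($a === $b && $b === $c); // bool(true), all three are [1, 4, 9]
```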
PHP's arrays
PHP provides arrays, which are actually ordered maps.
So if you're expecting an actual array (a map with 0..n keys), this contract is in comments only.
This also created interesting problems which manifest in the standard library, because sometimes you want the keys to be preserved and sometimes not.
Example:
// [0 => 0, 1 => 1, 2 => 2, 3 => 3, 4 => 4]
$ary = range(0, 4);
// [0 => 0, 2 => 2, 4 => 4]
$result = array_filter($ary, static fn($x) => $x % 2 === 0);
So you have to be cautious of that. It doesn't help that PHP calls that stuff an array.
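A minimal sketch of how to get back to a real list after filtering. array_values re-indexes the keys, and array_is_list (PHP 8.1+, so that's an assumption about your version) lets you check the 0..n contract at runtime.

```php
<?php
$ary = range(0, 4);

// array_filter preserves keys, so the result has gaps: keys 0, 2, 4.
$filtered = array_filter($ary, static fn(int $x): bool => $x % 2 === 0);
var_dump(array_is_list($filtered)); // bool(false)

// array_values throws the old keys away and re-indexes from 0.
$list = array_values($filtered);
var_dump(array_is_list($list));     // bool(true)
```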
In PHP, arrays are also not objects, which means you must call array_map($fn, $ary) instead of $ary->map($fn).
Arrays behave like values, much as integers do in other languages. If you modify one inside a function, you modify that function's own copy, not a shared object behind a pointer.
Example:
$ary = [1, 2, 3];
function add_four($ary) { $ary[] = 4; }
add_four($ary); // $ary is still [1, 2, 3]
If you do want to modify the caller's array, pass it by reference:
$ary = [1, 2, 3];
function add_four(&$ary) { $ary[] = 4; }
add_four($ary); // $ary is now [1, 2, 3, 4]
This is just an insignificant detail, but I thought it was worth mentioning.
And did you know that current, each (removed in PHP 8.0), end, next or prev can really make your coworker's day miserable? They all depend on the array's hidden internal pointer. So don't use them.
But PHP's arrays are also a happy accident, because they're great for writing JSON.
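A quick sketch of what that looks like with json_encode. Note how the list-versus-map distinction from above leaks into the output: an array with gaps in its keys becomes a JSON object, not a JSON array.

```php
<?php
// A list (keys 0..n) becomes a JSON array.
echo json_encode([1, 2, 3]), PHP_EOL;
// [1,2,3]

// String keys become a JSON object; nesting just works.
echo json_encode(['name' => 'Fionn', 'tags' => ['a', 'b']]), PHP_EOL;
// {"name":"Fionn","tags":["a","b"]}

// Gaps in the keys (here after array_filter) also produce an object.
$filtered = array_filter([0, 1, 2, 3, 4], static fn(int $x): bool => $x % 2 === 0);
echo json_encode($filtered), PHP_EOL;
// {"0":0,"2":2,"4":4}

// Re-index first if the consumer expects a JSON array.
echo json_encode(array_values($filtered)), PHP_EOL;
// [0,2,4]
```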
Type system without commitment
Modern PHP tries to have a better type-system story with each version (e.g. type declarations in properties).
But it lacks significant features like generics (templates, or something like them) and:
- array types (read: maps) cannot be specified beyond array, so there is no way to declare key or value types
- Generator types (yield) cannot be specified further either
Variables can be declared on the fly, without any complicated syntax like (type|let|const|var) varname = val, but with just $varname = val. As PHP variables aren't typed, their type can always completely change with a new assignment.
And yes, some of that can be specified somewhat further with a good LSP and PHPDoc.
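Roughly, that looks like the sketch below. The array<K, V> and Generator<K, V> annotations follow the convention understood by static analyzers such as PHPStan and Psalm; PHP itself ignores them and only enforces array and Generator. The function name is invented for this example.

```php
<?php
/**
 * Hypothetical example function.
 *
 * @param array<string, int> $scores  name => score (checked by tooling only)
 * @return Generator<int, string>     yields one formatted line per entry
 */
function format_scores(array $scores): Generator
{
    foreach ($scores as $name => $score) {
        yield "$name: $score";
    }
}

foreach (format_scores(['alice' => 3, 'bob' => 5]) as $line) {
    echo $line, PHP_EOL;
}
```

At runtime nothing stops you from passing ['alice' => 'three']; the extra precision lives only in the comment.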
Serverless
PHP is what many today call serverless (since 1995!), though it is typically hosted on just one server and not globally replicated on a CDN.
This creates several problems:
- no caching in memory between requests (you have to use caching extensions or services like APCu or memcached)
- WebSocket support? Where? (Ratchet apparently died and OpenSwoole is an uncommon extension)
- Short application lifespan, which doesn't allow PHP's JIT to scale well even when using OPcache
- PHP's ability to create global definitions changes the environment which makes optimizations over several requests more difficult
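To illustrate the caching point, here's a rough sketch of a read-through cache on top of APCu. The helper name cached() is made up, and it assumes the APCu extension is installed and enabled; if it isn't (which is the default on the CLI), the sketch just recomputes the value every time.

```php
<?php
// Hypothetical helper: return the cached value for $key, or compute and store it.
function cached(string $key, int $ttl, callable $compute): mixed
{
    // Without APCu there is nothing that survives the request, so just compute.
    if (!function_exists('apcu_enabled') || !apcu_enabled()) {
        return $compute();
    }

    $value = apcu_fetch($key, $hit);
    if ($hit) {
        return $value;
    }

    $value = $compute();
    apcu_store($key, $value, $ttl); // keep it for $ttl seconds
    return $value;
}

// Usage: the closure only runs on a cache miss.
$answer = cached('expensive_answer', 60, static fn(): int => 6 * 7);
var_dump($answer); // int(42)
```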
Additionally, the PHP interpreter isn't backed by Big Tech the way ECMAScript engines (or Python) are, which means that there isn't the budget for as many magic optimizations as V8 has.
Performance
Improving the performance of a PHP program as a PHP dev is hard for several reasons:
- it's an interpreted language
- use cases that aren't supported by the standard library are slow
- no async-await support
- no escape hatch like WebAssembly in ECMAScript
I actually wrote a full-blown lexer, parser, interpreter and compiler for a small company-specific template language. That experience taught me that writing readable code is sometimes the polar opposite of writing well-performing PHP code.
Small complaint corner about try catch
This might out me as a Rust guy, but try-catch statements are just the worst sometimes. In PHP you can throw an exception anywhere, and neither you nor the caller has to declare or handle it. At least in Java you can somewhat force the caller to know about your exception.
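A small sketch of what I mean. The function is invented for this example; the @throws tag is the only place the exception is announced, and PHP itself never checks it.

```php
<?php
/**
 * Hypothetical example function.
 *
 * @throws InvalidArgumentException when $input is not made of digits only
 */
function parse_port(string $input): int
{
    if (!ctype_digit($input)) {
        throw new InvalidArgumentException("Not a port: $input");
    }
    return (int) $input;
}

var_dump(parse_port('8080')); // int(8080)

// Nothing forces the caller to write this try/catch. Leave it out and the
// script still compiles; it just dies at runtime on bad input.
try {
    parse_port('http');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL; // Not a port: http
}
```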
Summary
PHP is a weird language with messy and contradictory ideas, but it is easy to read. It sometimes feels like developing something in C without the performance benefits.
Currently, a real deal-breaker can be the lack of async-await support (as in concurrency). I hope this changes.
Though, recent changes like adding Enumerations with match expressions really catapult the language forward. You might just be able to actually use that in production in half a decade (which isn't an entirely unrealistic timeframe).
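For the curious, a minimal sketch of the two features together. The Status enum is invented for this example and needs PHP 8.1 or newer. A match over an enum throws an UnhandledMatchError if a case is missing, instead of silently falling through like switch.

```php
<?php
// Hypothetical enum for illustration.
enum Status: string
{
    case Draft = 'draft';
    case Published = 'published';
    case Archived = 'archived';

    public function label(): string
    {
        // match compares strictly (===) and must cover the value it gets.
        return match ($this) {
            Status::Draft => 'Not visible yet',
            Status::Published => 'Live',
            Status::Archived => 'Read-only',
        };
    }
}

echo Status::from('published')->label(), PHP_EOL; // Live
var_dump(Status::tryFrom('deleted'));             // NULL
```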