PHP Internals News: Episode 72: PHP 8.0 Celebrations!

In this episode of "PHP Internals News" we're looking back at all the RFCs that we discussed on this podcast for PHP 8.0. In their own words, the RFC authors explain what these features are, with your host interjecting his own comments on the state of affairs.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:23

Hi, I'm Derick, and this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language.

Derick Rethans 0:32

This is Episode 72. PHP eight is going to be released today, November 26. In this episode, we look back across the season to find out which new features are in PHP eight dot zero. Where I have spoken with the instigator of a feature, I'm letting them explain what the new feature is. In the first episode of this current year, I spoke with Nikita Popov about weak maps, a feature that builds on top of the weak references that were introduced in PHP seven four. I asked: What's wrong with weak references, and why do we now need weak maps?

Nikita Popov 1:10

There's nothing wrong with weak references. As a reminder, what weak references are about: they allow you to reference an object without preventing it from being garbage collected. So if the object is unset, then you're just left with a dangling reference, and if you try to access it you will get null instead of the object. Now, probably the most common use case for any kind of weak data structure is a map or an associative array, where you have objects and want to associate some kind of data with them. Some typical use cases are caches or other memoizing data structures. And the reason why it's important for this to be weak is that, well, if you want to cache some data with the object and then nobody else is using that object, you don't really want to keep around that cached data, because no one is ever going to use it again, and it's just going to take up memory. And this is what the weak map does. So you use objects as keys, and some kind of data as the value. And if the object is no longer used outside this map, then it is also removed from the map.
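As a minimal sketch of the behaviour described here, assuming PHP 8.0 or later: the `User` class is a hypothetical value object invented for this illustration, and the cached array stands in for any expensive-to-compute data.

```php
<?php
// Hypothetical value object, used only for this illustration.
final class User
{
    public function __construct(public string $name) {}
}

$cache = new WeakMap();

$user = new User('alice');
$cache[$user] = ['profile' => 'expensive-to-compute data'];

// The key object is still referenced by $user, so the entry exists.
$countBefore = count($cache);

// Drop the last reference that lives outside the map.
unset($user);

// The entry disappears together with the object.
$countAfter = count($cache);
```

With a plain array keyed by `spl_object_id()` the cached data would stay around after the object is gone, which is the memory problem the weak map avoids.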

Derick Rethans 2:29

A main use case for weak maps will likely be ORMs and related tools. In the next episode, 38, I discussed the Stringable interface with Nicolas Grekas of Symfony fame. I asked: Nicolas, could you explain what Stringable is?

Nicolas Grekas 2:45

Hello, and Stringable is an interface that people could use to declare that they implement the magic __toString method.

Derick Rethans 2:53

That was a short and sweet answer, but the reason for wanting to introduce this new interface was much more complicated, and there are also potential issues with breaking backwards compatibility. Nicolas replied to my questioning about that with:

Nicolas Grekas 3:06

That's another goal of the RFC; the way I've designed it is that I think current code should be able to express the type right now, using annotations of course. What I mean is that the proposed interface, Stringable, is very easily polyfilled: we just create this interface in the global namespace that declares the method, and done. So we can do that now, we can improve the typings now. And then in the future, we'll be able to turn that into an actual union type.
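The polyfill idea can be sketched roughly as follows. This is an illustration, not the polyfill any particular library ships; the `Email` class and the `shout()` function are hypothetical names.

```php
<?php
// Polyfill sketch for PHP versions before 8.0.
// On PHP 8.0+ the engine provides Stringable itself, so this is skipped.
if (!interface_exists('Stringable')) {
    interface Stringable
    {
        public function __toString(): string;
    }
}

// Hypothetical class for illustration.
final class Email implements Stringable
{
    private string $address;

    public function __construct(string $address)
    {
        $this->address = $address;
    }

    public function __toString(): string
    {
        return $this->address;
    }
}

// The interface can now be used as a real type instead of an annotation.
function shout(Stringable $value): string
{
    return strtoupper((string) $value);
}

$result = shout(new Email('a@example.com'));
```

On PHP 8.0 a class that declares `__toString()` implements `Stringable` automatically, so the explicit `implements` mainly matters for the polyfilled case.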

Derick Rethans 3:39

I had a chat with Nikita Popov about union types as part of the last season in Episode 33 before PHP seven four was released, but after the feature freeze for it happened. I happen to speak with Nikita quite a lot because he does so much work improving PHP. In episodes 40 and 43, we discussed a bunch of smaller features and tweaks to the language. First up was the static return type, Nikita explains:

Nikita Popov 4:05

So PHP has three magic special class names: self, referring to the current class; parent, referring to the parent class; and static, which is the late static binding class name. And that's very similar to self. If no inheritance is involved, then static is the same as self and also refers to the current class. However, if the method is inherited and you call this method on the child class, then self is still going to refer to the original class, the parent, while static is going to refer to the class on which the method was actually called.
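A small sketch of the difference, assuming PHP 8.0 for the `static` return type; the `Model` and `Post` classes are hypothetical.

```php
<?php
class Model
{
    // self always means the class in which the method was written.
    public static function createSelf(): self
    {
        return new self();
    }

    // static means the class on which the method was actually called.
    public static function create(): static
    {
        return new static();
    }
}

class Post extends Model {}

$viaSelf = Post::createSelf();
$viaStatic = Post::create();

$selfClass = get_class($viaSelf);     // "Model"
$staticClass = get_class($viaStatic); // "Post"
```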

Derick Rethans 4:50

Next up was the class name literal on objects, which adds the following:

Nikita Popov 4:55

And that just returns the fully qualified class name. For example, if you have a use statement for that class, you get back the full name instead of the short name. I think we've had this since PHP five five. That's a great feature because it makes it clear that you're referencing a class and not just some random string, and that means, for example, that IDE refactorings can work better and so on.
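For illustration, assuming PHP 8.0: the `Invoice` class is hypothetical and lives in the global namespace to keep the sketch short. Inside a namespace both expressions would include the namespace prefix.

```php
<?php
// Hypothetical class for illustration.
class Invoice {}

$invoice = new Invoice();

// On a class name: available since PHP 5.5.
$fromName = Invoice::class;

// On an object: new in PHP 8.0, equivalent to get_class($invoice).
$fromObject = $invoice::class;
```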

Derick Rethans 5:20

In PHP seven, we touched up inconsistencies in PHP's compound variable syntax, but we missed a few cases. Nikita explains:

Nikita Popov 5:28

All of these remaining inconsistencies are like really, really minor things and edge cases. But weirdly, all or at least most of them are something that someone at some point ran into, and either opened a bug, or wrote me an email, or wrote on Twitter. So people somehow managed to still run into these things.

Derick Rethans 5:56

But syntax changes are not the only thing that needs fixing; sometimes we also get the theory slightly wrong. In this case related to inconsistencies with traits, Nikita explains again:

Nikita Popov 6:07

The problem is that traits are sometimes not self contained. So to give a specific example, in the logger PSR we have a trait called LoggerTrait, which has a bunch of methods like: warning, error, info, notice, and so on. These are just simple helper methods which all call the log method with a specific log level. This trait only specifies these helper methods, but still requires the actual class to implement the log method. The way we usually indicate that is by adding an abstract method to the trait. You have all the methods you actually want to provide by the trait, and you have a number of abstract methods that the trait itself requires to work. This already works fine. The problem is just that these methods are not actually validated, or they are inconsistently validated. Even though the trait specifies this abstract method, you could implement it in the class with a completely different signature.
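A reduced sketch of that pattern, with hypothetical names (`LevelHelpers`, `MemoryLogger`) rather than the actual PSR code:

```php
<?php
trait LevelHelpers
{
    // The using class has to provide this method. As of PHP 8.0 its
    // signature is checked against this declaration.
    abstract public function log(string $level, string $message): void;

    public function warning(string $message): void
    {
        $this->log('warning', $message);
    }

    public function error(string $message): void
    {
        $this->log('error', $message);
    }
}

final class MemoryLogger
{
    use LevelHelpers;

    /** @var string[] */
    public array $lines = [];

    // Compatible with the abstract declaration in the trait.
    public function log(string $level, string $message): void
    {
        $this->lines[] = "[$level] $message";
    }
}

$logger = new MemoryLogger();
$logger->warning('disk almost full');
```

If `MemoryLogger::log()` were declared with an incompatible signature, for example `log(int $level)`, PHP 8.0 would reject the class with a fatal error instead of silently accepting it.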

Derick Rethans 7:09

Probably one of the big new features in PHP 8.0 is attributes, certainly the most discussed new feature, with various RFCs to introduce the feature and then change the syntax a few times. It all started with Benjamin Eberlei introducing the feature in April. In Episode 47, I asked Benjamin what attributes are. He replied:

Benjamin Eberlei 7:30

They are a way to declare structured metadata on declarations of the language. So in PHP, or in my RFC, this would be classes, class properties, class constants, and regular functions. You could declare additional metadata there that sort of tags, those declarations with specific additional machine readable information.

Derick Rethans 7:56

At the moment, many tools already use docblock comments to do a similar thing. So I asked how attributes are different. Benjamin answered with his first proposed syntax:

Benjamin Eberlei 8:06

The idea is that we introduce a new syntax that is independent of the docblock comments. Essentially, before each declaration, you can use the lesser-than symbol twice, then the attribute declaration, and then the greater-than sign twice. This is the syntax I've reused from the previous attributes RFC. Dmitry, at that point, used the syntax from Hack. And it makes sense to reuse this, not because Hack and PHP are going in the same direction anymore, but because when Hack introduced it, they had the same problem of which symbols are actually still easy to use. And we do have a bit of a problem in PHP with the kind of free symbols we can still use at certain places, and lesser-than and greater-than at this point are easy to parse. There are a bunch of alternatives, and one thing that I would probably propose is an alternative syntax, where we start with a percentage sign, then a square bracket open, and then a square bracket close. It is more in line with how Rust declares attributes, but Rust uses the hash symbol, which we can't use because it's a comment in PHP.

Derick Rethans 9:23

He already hinted at alternatives for syntax and we'll get back to that in a moment. However, the main important thing is what was inside the surrounding attribute notation, and Benjamin explains again:

Benjamin Eberlei 9:35

You declare an attribute name, and then you have a parenthesis open and parenthesis close to pass optional arguments. You don't have to use them, so you can use only the attribute name if you just want to tag something: this is a validator, this is an event listener, or whatever you come up with to use attributes for. But if you need to configure something in addition, then you can use the syntax that looks like constructing a new class, except that you don't have to put the new keyword in front of it.

Derick Rethans 10:10

In the rest of that episode we spoke about how to use attributes and what the general ideas behind them were. Soon after the attributes RFC was accepted, Benjamin proposed a second related RFC to tweak some of the working that came up throughout the discussion phase. At the time of the original RFC he did not want to make any changes in order to keep the discussion focused, but there were some tweaks necessary. In Episode 64 Benjamin explains what these changes were:

Benjamin Eberlei 10:38

There was renaming the attribute class. So, the class that is used to mark an attribute from PHPAttribute to just Attribute. I guess we go into detail in a few seconds but I just list them. The second one is an alternative syntax to group attributes and yeah save a little bit on the characters to type and allow to group them. The third was a way to validate which declarations an attribute is allowed to be set on, and the fourth was a way to configure if an attribute is allowed to be declared once or multiple times, on one declaration.

Derick Rethans 11:23

All the suggested tweaks passed with ease. A contentious issue, however, was the syntax to enclose the attributes. Two RFCs later, the PHP development team finally settled on the syntax which has attributes enclosed in hash, square bracket open, and close with the square bracket.
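In the syntax that was finally settled on, an attribute and its use could look roughly like this. The `ListensTo` attribute and the `OrderSubscriber` class are hypothetical names chosen for the sketch; `Attribute`, `ReflectionMethod::getAttributes()` and `newInstance()` are the pieces PHP 8.0 provides.

```php
<?php
// Hypothetical attribute class. The Attribute::TARGET_METHOD flag limits
// where it may be used, one of the tweaks from the follow-up RFC.
#[Attribute(Attribute::TARGET_METHOD)]
final class ListensTo
{
    public function __construct(public string $event) {}
}

final class OrderSubscriber
{
    // Arguments are passed like constructor arguments, without "new".
    #[ListensTo('order.created')]
    public function onOrderCreated(): void {}
}

// Tools read the metadata back through reflection.
$method = new ReflectionMethod(OrderSubscriber::class, 'onOrderCreated');
$attributes = $method->getAttributes(ListensTo::class);

// The attribute object is only constructed when newInstance() is called.
$listener = $attributes[0]->newInstance();
$event = $listener->event;
```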

Derick Rethans 11:44

George Peter Banyard likes tidying up things in the language. I spoke with him on several occasions this season, where he was suggesting to just do that. In the first instance he's suggesting to change PHP to use locale independent floating point numbers to string conversions. He explains the problem:

George Peter Banyard 12:02

Currently, when you do a float to string conversion, so casting or displaying a float, the conversion will depend on the current locale. So instead of always using the decimal dot separator, if you have, for example, a German or French locale enabled, it will use a comma to separate the decimals.

Derick Rethans 12:23

He explained what he suggested to change:

George Peter Banyard 12:26

Change it, more or less, to always make the conversion from float to string the same, so locale independent, so it always uses the dot decimal separator. With the exception of printf with the f modifier, because that one is, as previously said, locale aware, and it's explicitly said so.
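A short sketch of the conversions in question. It deliberately does not switch locales, because which locales are installed differs per system; the comments describe the PHP 8.0 behaviour.

```php
<?php
$price = 3.14;

// Casting and concatenation: always a dot on PHP 8.0+, whatever
// LC_NUMERIC is set to. PHP 7 could produce "3,14" here under e.g. de_DE.
$cast = (string) $price;
$joined = 'Total: ' . $price;

// The printf family: %f stays locale aware, while the upper-case %F
// is the variant that never looks at the locale.
$fixed = sprintf('%.2F', $price);
```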

Derick Rethans 12:46

The second RFC was candidly titled saner numeric strings. I asked George, what the scope of the problem he wanted to address is.

George Peter Banyard 12:55

PHP has the concept of numeric strings, which are strings which have like integers or floats encoded as a string. Mostly that would arrive when you have like a GET request or a POST request and you take like the value of the form, which would be in a string. And the issue is that PHP makes some kind of weird distinctions, and classifies numeric strings in three different categories mainly. So there are purely numeric strings, which are pure integers or pure float, which can have an optional leading whitespace and no trailing whitespace. However trailing white spaces are not part of the numeric string specification in the PHP language. To deal with that PHP has a concept of leading numeric strings, which are strings which are numeric but ...
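The categories can be probed with `is_numeric()`. The sketch below sticks to cases that behave the same before and after the change; trailing whitespace is left out on purpose because its treatment is exactly what the RFC alters.

```php
<?php
// Purely numeric strings: integers or floats encoded as a string.
$integerString = is_numeric('123');     // true
$floatString = is_numeric('1.5');       // true

// Leading whitespace is allowed in a numeric string.
$leadingSpace = is_numeric(' 123');     // true

// A number followed by other characters is not a numeric string.
$leadingNumeric = is_numeric('5elephant'); // false

// No number at all.
$nonNumeric = is_numeric('elephant');   // false
```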

Derick Rethans 13:42

As you can hear, the way PHP handles these numbers in strings is extremely complicated. The fix is just as complicated, but it pretty much boils down to no longer treating strings like "5elephant" as a number. In the last episode, I briefly discussed Larry Garfield's object ergonomics article, where he sets out a more coherent way forward in thinking about how to solve some of the bigger pain points of PHP, mostly related to value objects. Although he did not end up proposing any RFCs himself, Nikita Popov did take some inspiration from it. And he proposed two features for inclusion into PHP eight: constructor property promotion, and named arguments. In Episode 53 Nikita explains what constructor property promotion intends to solve.

Nikita Popov 14:29

Right now, if we take a simple example from the RFC, we have a class Point, which has three properties, x, y and z. And each of those has a float type. And that's really all the class is; ideally, this is all we would have to write. But of course, to make this object actually usable we also have to provide a constructor. And the constructor is going to repeat that, yes, we want to accept three floating point numbers, x, y and z, as parameters. And then in the body we have to again repeat that, okay, each of those parameters needs to be assigned to a property. So we have to write this x equals x, this y equals y, this z equals z. I think for the Point class this is still not a particularly large burden, because we have only three properties, the names are nice and short, the types are really short, and we don't have to write a lot of code. But if you have larger classes with more properties, with more constructor arguments, with larger and more descriptive names, and also larger and more descriptive type names, then this makes up quite a bit of boilerplate code.

Derick Rethans 15:52

I asked: What is the syntax that you're proposing to improve this?

Nikita Popov 15:57

The syntax is to merge the constructor and the property declarations, so you declare the constructor, and you add an extra visibility keyword in front of the normal parameter name. So instead of accepting float x, in the constructor. You accept public float x. And what this shorthand syntax does is to also generate the corresponding property. So you're declaring a property, public float x, and to also implicitly perform this assignment in the constructor body so to assign this x equals x. This is really all it does so it's just syntactic sugar. It's a simple syntactic transformation that we're doing, but that reduces the amount of boilerplate code you have to write for value objects in particular, because for those commonly, you don't really need much more than the properties and the constructor.
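The Point example described above can be sketched in both forms, assuming PHP 8.0. The class names `VerbosePoint` and `Point` are chosen here only to have the two versions side by side, and the default values are an assumption of this sketch.

```php
<?php
// The long-hand version: every name appears four times.
final class VerbosePoint
{
    public float $x;
    public float $y;
    public float $z;

    public function __construct(float $x = 0.0, float $y = 0.0, float $z = 0.0)
    {
        $this->x = $x;
        $this->y = $y;
        $this->z = $z;
    }
}

// With constructor property promotion: the visibility keyword in front
// of each parameter declares the property and performs the assignment.
final class Point
{
    public function __construct(
        public float $x = 0.0,
        public float $y = 0.0,
        public float $z = 0.0,
    ) {}
}

$verbose = new VerbosePoint(1.5, 2.5, 3.5);
$point = new Point(1.5, 2.5, 3.5);
```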

Derick Rethans 16:58

Tying in with constructor property promotion was a slightly more controversial RFC, named arguments, which I discussed with Nikita in Episode 59. I asked him what named arguments are:

Nikita Popov 17:09

Currently, if you're calling a function or a method, you have to pass the arguments in a certain order, the same order in which they were declared in the function or method declaration. And what named arguments, or named parameters, allow us to do is to instead specify the argument names when doing the call. Just taking the first example from the RFC, we have the array_fill function, and the array_fill function accepts three arguments. So you can call it like array_fill(0, 100, 50). Now, what does that actually mean? This function signature is not really great, because you can't really tell what the meaning of each parameter is and in which order you should be passing them. So with named parameters, the same code would be something like array_fill(start: 0, number: 100, value: 50). And that should immediately make this call much more understandable, because you know what the arguments mean. And this is really one of the main motivations or benefits of having named parameters.
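As runnable code, assuming PHP 8.0: note that the parameter names as declared for `array_fill()` in PHP 8.0 are `start_index`, `count` and `value`, which differ slightly from the names spoken above. The `createUser()` function is a hypothetical user-land example.

```php
<?php
// Positional call: the meaning of each number is not obvious.
$positional = array_fill(0, 3, 'x');

// The same call with named arguments.
$named = array_fill(start_index: 0, count: 3, value: 'x');

// Hypothetical function for illustration.
function createUser(string $name, bool $admin = false, bool $active = true): array
{
    return ['name' => $name, 'admin' => $admin, 'active' => $active];
}

// Named arguments also allow skipping over defaulted parameters.
$user = createUser('alice', active: false);
```

The flip side, raised later in the discussion, is visible here too: renaming `$active` in `createUser()` would break this call.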

Derick Rethans 18:21

We also briefly touched on the main issues where the introduction of named arguments could introduce backward compatibility issues.

Nikita Popov 18:28

If you don't use named arguments, then nothing is going to break. But of course, if named arguments are used with code that did not expect them, then we can run into some issues. So that's one of the issues. And the other one is more of a long term maintenance concern: if we introduce named parameters, then the parameter names become significant to the API. Which means you cannot rename parameters in minor versions of libraries if you're semver compatible, because you might be breaking some code using those parameter names. And I think one of the biggest concerns that has come up in the discussion is that this is a significant increase in the API burden for open source libraries.

Derick Rethans 19:15

Beyond the main features that we've discussed so far, PHP eight also adds a few smaller ones. For example, the non-capturing catches, which I discussed with Max Semenik in Episode 58. He explains his short proposal:

Max Semenik 19:29

In current PHP, you have to specify a variable for exceptions you catch, even if you don't need to use this variable in your code. And I'm proposing to change it to allow people to just specify an exception type.
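A minimal sketch of a catch block without a variable, assuming PHP 8.0; `safeDivide()` is a hypothetical helper.

```php
<?php
// Hypothetical helper: returns null instead of failing on division by zero.
function safeDivide(int $a, int $b): ?int
{
    try {
        return intdiv($a, $b);
    } catch (DivisionByZeroError) {
        // Only the exception type is given; no unused $e variable is needed.
        return null;
    }
}

$ok = safeDivide(10, 2);
$failed = safeDivide(1, 0);
```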

Derick Rethans 19:48

This proposal passed with 48 votes for, and one against. In the last few major PHP releases, a lot of focus was put into strengthening PHP's type system. You see that in PHP seven four with additions to OO variance rules, and in PHP eight already with union types, the Stringable interface, and the static return type. Dan Ackroyd explains in Episode 56 why he was suggesting to add the predefined union type "mixed".

Dan Ackroyd 20:14

I have a library for validating parameters, and due to how that library needs to work, the code passes user data around a lot internally, and then back out to where the library returns the validated result. So I was upgrading that library to PHP 7.4, and that version introduced property types, which are very useful things. What I was finding was that I was going through the code, trying to add types everywhere I could. And there's a significant number of places where I just couldn't add a type, because my code was holding user data. It could be any type. The mixed type had been discussed before, an idea that people had kind of been kicking around, but it had just never really been worked on. So that was the motivation for me. I was having this problem where I couldn't upgrade my library as I wanted to; I kept forgetting: has this bit of code here been upgraded and I just can't add a type, or is it the case that I haven't touched this bit of code yet?
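The situation described here can be sketched as follows, assuming PHP 8.0. The `ValidationResult` class and `describe()` function are hypothetical and are not taken from the library being discussed.

```php
<?php
// Hypothetical holder for user data of any type. Writing "mixed" makes it
// explicit that the missing restriction is deliberate, not an oversight.
final class ValidationResult
{
    public function __construct(public mixed $value) {}
}

function describe(mixed $input): string
{
    // get_debug_type() is also new in PHP 8.0.
    return get_debug_type($input);
}

$fromInt = describe((new ValidationResult(42))->value);
$fromString = describe((new ValidationResult('42'))->value);
$fromNull = describe((new ValidationResult(null))->value);
```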

Derick Rethans 21:16

When I spoke with Dan, he also mentioned that sometimes he assists with writing RFCs in case some person would benefit from some technical editing, for example, due to language barriers. In the same way, I ended up speaking to Dan again in Episode 65 about the null safe operator, on which he was working with Ilija Tovilo. Dan explains what the feature is about.

Dan Ackroyd 21:38

Imagine you've got a variable that's either going to be an object, or it could be null. If the variable is an object, you're going to want to call a method on it, which obviously, if it's null, you can't do, because it gives an error. Instead, what the null safe operator allows you to do is to handle those two different cases in a single line, rather than having to wrap everything with if statements to handle the possibility that it's null. The way it does this is through a thing called short circuiting. So instead of evaluating the whole expression, as soon as you use the null safe operator and the left hand side of the operator is null, everything gets short circuited and it all just evaluates to null instead.
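A sketch of the operator in use, assuming PHP 8.0; the `Address` and `Customer` classes and the `cityOf()` function are hypothetical.

```php
<?php
final class Address
{
    public function __construct(public string $city) {}
}

final class Customer
{
    public function __construct(private ?Address $address = null) {}

    public function getAddress(): ?Address
    {
        return $this->address;
    }
}

function cityOf(?Customer $customer): ?string
{
    // If $customer or the address is null, the rest of the chain is
    // skipped and the whole expression evaluates to null.
    return $customer?->getAddress()?->city;
}

$noCustomer = cityOf(null);
$noAddress = cityOf(new Customer());
$city = cityOf(new Customer(new Address('London')));
```

Without the operator, `cityOf()` would need two explicit null checks before reaching `city`.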

Derick Rethans 22:18

It also gets an additional benefit related to having shorter code.

Dan Ackroyd 22:24

And having the information about what the code's doing in the code, rather than in people's heads, makes it a lot easier for compilers and static analyzers to do their jobs.

Derick Rethans 22:35

The last big nice syntax feature in PHP eight zero is the match expression, again by Ilija Tovilo. Instead of me interviewing Dan again, I decided as a joke to interview myself on the subject. It was a little bit surreal but I think it worked out well enough, as a one off event. I first explained to myself the problem with the existing switch language construct.

Derick Rethans 22:56

So, before we talk about the match expression, we really need to talk about switch. Switch is a language construct in PHP that, as you probably know, allows you to jump to different cases depending on a value. So you have the switch statement: switch, parenthesis opening, variable name, parenthesis closing, and then for each of the things that you want to match against, you use case and a condition, and that condition can be either a static value or an expression. But switch has a bunch of different issues that are not always great. So the first thing is that it matches with the equals operator, or the equals, equals sign. And this operator, as you probably know, will ignore types, causing interesting issues sometimes when you're matching variables that contain strings against cases that contain numbers, or a combination of numbers and strings. So, if you do switch on the string foo, and one of the cases is case zero, then it will still be matched because it type juggles foo to zero, and that is of course not particularly useful. At the end of every case statement you need to use break, otherwise it falls through to the case that follows. Now sometimes that is something that you want to do, but in many other cases that is something that you don't want to do, and you need to always use break. If you forget, then some weird things will happen sometimes. Another common thing to use it for is that you switch on a variable, and then what you really want to do is, depending on which case is being matched, assign a value to a variable. And the current way you need to do that now is: case zero, $result equals string one, break; and then you have case two, where you set $result equals string two, and so on and so on. Which isn't always a very nice way of doing it, because you keep repeating the assignment all the time. And another, but minor, issue with switch is that it is okay not to cover every value with a condition. So, it's totally okay to have case statements and then not have a condition for a specific value, and switch doesn't require you to add default at the end either, so you can actually end up having a value that would never match any case, and you have no idea that that would happen.

Derick Rethans 25:16

Before I went into details, I also explained how the new match language construct could solve some of these criticisms.

Derick Rethans 25:24

The match expression is a new language keyword, which also allows you to switch depending on a condition matching a variable. You're saying matching this variable against a set of expressions just like you would do with switch. But there's a few major differences with switch here. Unlike switch, match returns a value, meaning that you can do return value equals match, then your variable that you're matching on, and the value that gets assigned to this variable is the result of the expression on the right hand side of each condition.
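The contrast between the two constructs can be sketched like this, assuming PHP 8.0; the function names and the status-code example are hypothetical.

```php
<?php
// switch: loose comparison, explicit break, repeated assignment.
function labelWithSwitch($code): string
{
    switch ($code) {
        case 200:
            $result = 'ok';
            break;
        case 404:
            $result = 'not found';
            break;
        default:
            $result = 'unknown';
    }

    return $result;
}

// match: strict comparison, no fall-through, and it returns a value.
function labelWithMatch($code): string
{
    return match ($code) {
        200 => 'ok',
        404 => 'not found',
        default => 'unknown',
    };
}

// The string "200" equals the integer 200 loosely, but not strictly.
$switchLoose = labelWithSwitch('200');
$matchStrict = labelWithMatch('200');

// Without a default arm, an uncovered value raises an error instead of
// silently doing nothing.
try {
    $uncovered = match (500) {
        200 => 'ok',
    };
} catch (UnhandledMatchError) {
    $uncovered = 'unhandled';
}
```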

Derick Rethans 26:04

That's it for the new features in PHP eight, but I haven't spoken yet about the reason why the PHP team is releasing PHP eight and not PHP seven dot five. And of course, that is PHP's new JIT engine, which is slated to improve performance quite a lot. I have some concerns of my own. And in Episode 48 I spoke with Sara Golemon, who articulated my main concerns with it more eloquently.

Sara Golemon 26:31

If you go and look at the engine, particularly the runtime pieces of the engine, although the compiler's complex as well. You have to do a lot of digging before you even get to a point that you can see how the pieces maybe start to fit together. You and I have spent enough time in the engine code that we know where to look for a particular thing like let's say that opcode you mentioned that implements strlen. We know that Zend VM def dot h has got the definition for that. We also know that that file is not real code, it's a pre processed version of code that gets built later on. Somebody coming to that blind is not going to see a lot of those pieces. So there's already this big ramp up just to get into the Zend engine, as it exists now in 7.4. Let's add JIT on top of that, you've got code that is doing call forward graphs and single static analysis and finding these tracelets, and making sense of the code at a higher level than a single instruction at a time, and then distilling that down to instructions that the CPU is going to recognize, and CPU instructions are these packed complex things that deal with immediates and indirects, and indirects of indirects, and registers, and the x86 call API is a ridiculous thing that nobody should ever have to look at. So you add all this complexity to it, that by the way, sits in ext/opcache, it's all isolated to this one extension, that reaches into the engine and fiddles around with things to make all this JIT magic happen and we're going to take your reduced set of developers who know how to work on Zend engine and you're going to reduce that further. I think at the moment it's still only about three or four people who actually understand how PHP's JIT is put together enough that they can do any effective work on it.

Derick Rethans 28:20

I am still a little apprehensive about whether the effort of introducing a JIT engine is going to pay off. I certainly hope that I'm going to be proven wrong and that the JIT engine is going to be a massive performance boost. But in the end, we do definitely need more people to understand and work on the PHP engine, and on the new JIT engine that is built into opcache. Perhaps that's something you yourself might want to have a look at in 2021. PHP 8 will be out later today, and I hope that you're pleased with all the new features that the PHP development team worked hard on throughout the year. With this I'm concluding this episode and also this year's season. I will be back in the new year with more episodes where I hope to demystify the development of the PHP engine some more. Enjoy the holidays and stay safe.

Derick Rethans 29:08

Thank you for listening to this installment of PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next year.


Type hints, SOLID programming, burn out, and more

Listen to Eric, John, and Oscar discuss the articles in the November 2020 issue, SOLID Foundations

Topics Covered

  • OSMI 2020 Mental Health in Tech Survey.
  • Practical uses for scalar type hints in PHP.
  • SOLID principles for programming.
  • The peculiarities of floating point math and handling money calculations as a result.
  • Using locks to prevent race conditions.
  • Podcasts to listen to related to PHP and web development.
  • Using __debugInfo() to keep secrets safe.
  • Preventing burn out.
  • and more (as usual)

 

The post Type hints, SOLID programming, burn out, and more appeared first on php[architect].


Growing pains, open source documentation, and cascading deletes

Jake and Michael discuss the growing pains a business can face as they scale up, creative solutions to getting markdown-based docs into a Vapor application, and cascading deletes of tens of thousands of records in MySQL.

This episode is sponsored by Fathom Analytics, simple, privacy-focused website analytics for bloggers & businesses and Workvivo, the employee communication platform for the modern workplace.

You can catch the live stream of this episode on YouTube.

214: Positive Vibes

This week on the podcast, Eric is in a truly positive mood, which might have something to do with him switching back to Vim full-time. John follows up on discussions he had with other developers around the topic of named arguments. And Tom talks security, doom and gloom and much more...

If you haven't checked it out yet, go grab this month's free article from PHP[architect] - Community Corner: Podcast—Mic Check

PHPUgly streams the recording of this podcast live. Typically every Thursday night around 9 PM PT. Come and join us, and subscribe to our Youtube Channel, Twitch, or Periscope. Also, be sure to check out our Patreon Page.


PHP Internals News: Episode 71: What didn’t make it into PHP 8.0?

In this episode of "PHP Internals News" we're looking back at all the RFCs that we discussed on this podcast for PHP 8.0, but did not end up making the cut. In their own words, the RFC authors explain what these features are, with your host interjecting his own comments on the state of affairs.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:15

Hi, I'm Derick, and this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 71. At the end of last year, I collected snippets from episodes about all the features that did not make it into PHP seven dot four, and I'm doing the same this time around. So welcome to this year's "Which things were proposed to be included into PHP 8.0, but didn't make it". In Episode 41, I spoke with Steven Wade about his __toArray RFC, a feature he wanted to add to PHP to scratch an itch. In his own words:

Steven Wade 0:52

This is a feature that I've kind of wished would have been in the language for years, and I was talking with a few people who encouraged me. It's kind of like the rule of starting a user group, right: if there's not one and you have the desire, then you're the person to do it. A few people encouraged me and said, well, why don't you go out and write it? So I've spent the last two years kind of trying to work up the courage, or research it enough, or make sure I write the RFC the proper way. And then also actually have the time to commit to writing it, and following up with any of the discussions as well.

Steven Wade 1:20

I want to introduce a new magic method; as you said, the name of the RFC is the double underscore toArray. And so the idea is that you can cast an object, if your class implements this method, just like it would with __toString: if you cast it manually to array, then that method will be called if it's implemented, or, as I said in the RFC, array functions can automatically cast it if you're not using strict types.

Derick Rethans 1:44

I questioned him on potential negative feedback about the RFC, because it suggested adding a new magic method. He answered:

Steven Wade 1:53

The beauty of PHP is in its simplicity. And so, adding more and more interfaces kind of expands class declarations and enforcement, and in my opinion can lead to a lot of clutter. So I think PHP is already very magical, and the precedent has been set to add more magic to it with seven four, with the introduction of the __serialize() and __unserialize() magic methods. And so for me it's just kind of a tool. I don't think that it's necessarily a bad thing or a good thing; it's just another option for the developer to use.

Derick Rethans 2:21

The RFC was not voted on, and the feature hence did not make it into PHP eight zero.

Derick Rethans 2:27

Operator overloading is a topic that has come up several times over the last 20 years that PHP has been around; there is even an extension that implements it in the PECL repository. Jan Böhmer proposed to include userspace operator overloading for PHP eight dot zero. I asked him about specific use cases:

Jan Böhmer 2:46

Higher mathematical objects like complex numbers, vectors, something like tensors. Or maybe something like the String component of Symfony: you can simply concatenate this string object with a normal string using the concat operator, and don't have to use a function to do this. Basically, this should behave similarly to a basic string variable, and not like something completely different.

Derick Rethans 3:16

There were some issues raised during the RFC process, and Jan explains the most notable criticisms:

Jan Böhmer 3:21

First of all, there are some principles of operator overloading in general. So there's also criticism that it could be used for doing some very weird things with operator overloading. C++ was mentioned, where the left shift operator is used for outputting a string to the console. Or you could do whatever you want inside this handler, so if somebody wanted to save files, or modify a file, inside an operator overload, that would be possible. In most cases, a function will be more clear about what it does.

Derick Rethans 4:01

He also explained his main use case:

Jan Böhmer 4:04

Operator overloading should, in my opinion, only be used for things that are related to math, or for creating custom types that behave similarly to built-in types.

Derick Rethans 4:15

In the end, the operator overloading RFC was voted on, but ultimately declined, although there was a slim majority for it.

Derick Rethans 4:24

In Episode 44, I spoke with Máté Kocsis about the write once properties RFC and asked him what the concept behind them was. He explained:

Máté Kocsis 4:33

Write once properties can only be initialized, but not modified afterwards. So you can either define a default value for them, or assign them a value, but you can't modify them later, so any other attempts to modify, unset, increment, or decrement them would cause an exception to be thrown. Basically this RFC would bring Java's final properties, or C#'s read only properties to PHP. However, contrary to how these languages work, this RFC would allow lazy initialization, it means that these properties don't necessarily have to be initialized until the object construction ends, so you can do that later in the object's lifecycle.
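The semantics described above can be approximated in userland. The following is only a sketch of the behaviour (initialize once, possibly after construction, then reject further writes); it is not the syntax the RFC proposed, and the `Invoice` class and its method names are hypothetical.

```php
<?php
// Illustrative userland approximation of write-once semantics.
final class Invoice
{
    private ?string $number = null;

    public function setNumber(string $number): void
    {
        // A second write is rejected with an exception.
        if ($this->number !== null) {
            throw new LogicException('number may only be initialized once');
        }
        $this->number = $number;
    }

    public function number(): ?string
    {
        return $this->number;
    }
}

$invoice = new Invoice();        // no value needed at construction (lazy initialization)
$invoice->setNumber('INV-001');  // first write succeeds

$secondWriteRejected = false;
try {
    $invoice->setNumber('INV-002');
} catch (LogicException $e) {
    $secondWriteRejected = true;
}
```

The difference with the RFC is that this guard lives in a setter, so code inside the class can still bypass it, which is exactly the kind of internal modification the proposal wanted the engine to prevent.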

Derick Rethans 5:22

Write once properties was not the only concept that he had explored before writing this RFC. We discussed these in the same episode:

Máté Kocsis 5:31

The first one was to follow Java and C#, and require all write once properties to be initialized until the object construction ends, and this is what we talked about before. The counter arguments were that it's not easy to implement in PHP, and that the approach is unnecessarily strict. The other possibility is to allow unlimited writes to these properties until object construction ends, and then not allow any writes. The positive effect of this solution is that it plays well with bigger class hierarchies, where possibly multiple constructors are involved, but it still has the same problems as the previous approach. And finally, property accessors could be an alternative to write once properties. Although, in my opinion, these two features are not really related to each other, some say that property accessors alone could prevent some unintended changes from the outside, and they say that maybe that might be enough. I don't share this sentiment. In my opinion, unintended changes can come from the inside, so from the private or protected scope, and it's really easy to circumvent visibility rules in PHP. There are quite some possibilities. That's why it's a good way to protect our invariants.

Derick Rethans 7:02

In the end this RFC was declined, as it did not reach the two-thirds majority required, with an even split between the proponents and the opponents.

Derick Rethans 7:11

Following on from Máté's proposal to add functionality to our object orientation syntax, I spoke in Episode 49 with Jakob Givoni about a suggested addition: COPA, or in full, compact object property assignment. Jakob explains why he was suggesting to add this.

Jakob Givoni 7:28

I was puzzled for a long time why PHP didn't have object literals, and I looked into it, and I saw that it was not for lack of trying. Eventually I decided to give it a go with a different approach. The basic problem is simply to be able to construct, populate, and send an object in one single expression in a block, also called inline. It can be like an alternative to an associative array: you give the data a well-defined structure, and the signature of the data is all documented in the class.

Derick Rethans 8:01

Of course people abuse associative arrays for these things at the moment, right. Why are you particularly interested in addressing this deficiency as you see it?

Jakob Givoni 8:11

Well, I think it's a common task. It's something I've been missing, as I said, inline objects, object literals, for a long time, and I think a lot of people have been looking for something like this. And also it seemed like it was an opportunity that was fairly simple to grasp.

Derick Rethans 8:28

I also asked him what the main use case for this was.

Jakob Givoni 8:32

Briefly, as I mentioned, there are data transfer objects, value objects, and those simple associative arrays that are sometimes used as argument bags to constructors when you create objects. Some people have given some examples where they would like to use this to dispatch events or commands to some different handlers. And whenever you want to create, populate, and use the object in one go, COPA should help you.

Derick Rethans 9:04

COPA also did not make it into PHP eight, with the RFC being declined nearly unanimously. The proposals by both Máté and Jakob were meant to improve PHP's object syntax by helping out with common tasks. The implementation ideas of what they were trying to accomplish were not particularly aligned. This spurred on Larry Garfield to write a blog post titled "Object Ergonomics", which I discussed with him in Episode 51. I first asked him why he wrote this article:

Larry Garfield 9:33

As you said, there's been a lot of discussion around improving PHP's general user experience of working with objects, where there's definitely room for improvement, no question. And I found a lot of these proposals to be useful in their own right, but also very narrow, and narrow in ways that solve the immediate problem, but could get in the way of solving larger problems later on down the line. I went into this with an attitude of: okay, we can kind of piecemeal attack certain parts of the problem space, or we can take a step back and look at the big picture and say: all right, here's all of the pain points we have. What can we do that would solve not just this one pain point, but let us solve multiple pain points with a single change? Or these two changes together solve this other pain point as well. Or, you know, how can we do this in a way that is not going to interfere with later development that we've talked about, that we know we want to do, but that hasn't been done yet? How do we not paint ourselves into a corner by thinking too narrowly?

Derick Rethans 10:40

The article mentions many different categories and possible solutions. I can't really sum these up in this episode because it would be too long. Although Larry did not end up proposing an RFC based on this article, it can be called responsible for constructor property promotion, which I discussed with Nikita Popov in Episode 53, and named arguments, which I discussed with Nikita in Episode 59. Both of these made it into PHP 8.0 and cover some of the same functionality that Jakob's COPA RFC covered. I will touch on the new features that did make it into PHP 8.0 in next week's episode. There are two more episodes where I discussed features that did not make it into PHP eight zero, but these are still under discussion and hence might make it into next year's PHP eight dot one. In Episode 57, I spoke with Ralph Schindler about his conditional code flow statements RFC. After the introduction, I asked what he specifically was wanting to introduce.

Ralph Schindler 11:36

This is, you know, very closely related to what in computer science is called a guard clause. And I used that phrase lightly when I originally brought it up on the mailing list, but it's very closely in line with that; it's not necessarily exactly that, in terms of the syntax. When you speak about it in the PHP code sense, it really is sort of a change in the statement. So putting the return before the if, that's really what it is. So a guard clause, and it's important to know what that is: it's a way to interrupt the flow of control.
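For readers unfamiliar with the term, a guard clause in today's PHP looks like the sketch below. The `shippingCost()` function and its pricing rule are hypothetical. The RFC's proposed syntax, which would move the `return` in front of the condition, is deliberately not reproduced here, since its exact form is not given in this episode.

```php
<?php
// Hypothetical function, used only to illustrate the pattern.
function shippingCost(?array $order): float
{
    // Guard clauses: leave early when a precondition fails,
    // so the main logic below is not nested inside if/else blocks.
    if ($order === null) {
        return 0.0;
    }
    if ($order['weight'] <= 0) {
        return 0.0;
    }

    // Main path: flat fee plus a per-kilogram rate.
    return 5.0 + 1.5 * $order['weight'];
}

$cost = shippingCost(['weight' => 2]);
$none = shippingCost(null);
```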

Derick Rethans 12:08

Syntax proposals are fairly controversial, and I asked Ralph about his opinions of the type of feedback that he received.

Ralph Schindler 12:15

The smallest changes always get the most feedback, because there's such a wide audience for a change like this.

Derick Rethans 12:23

The last feature that did not make it into PHP eight zero was property write/set visibility, which I discussed with André Rømcke in Episode 63. I asked him what his RFC was all about:

Derick Rethans 12:34

What is the main problem that you're wanting to solve with what this RFC proposes?

André Rømcke 12:40

The high level use case is to let people somehow define that their property should not be writable. This has many benefits when you build APIs, in order to say: yeah, this property should be readable, but I don't want anyone else but myself to write it. And then you have different forms of this. You have the immutable case, where you ideally would like to specify that it's only written to in the constructor, maybe unset in the destructor, maybe dealt with in clone and so on, but besides that, it's not writable. I'm not going into that yet, but I was at least trying to lay the foundation for it by allowing the visibility, or the access rights, to be asymmetric, which I think is a building block for moving forward with immutability, read only, and potentially also accessors, but that's a special case.

Derick Rethans 13:39

At the time of our discussion he already realized that it would be likely postponed to PHP eight dot one as it was close to feature freeze, and the RFC wasn't fully thought out yet. I suspect we'll hear more about it in 2021. With this I would like to conclude this whirlwind tour of things that were proposed but did not make it in. Next week I'll be back with all the stuff that was added to PHP for the PHP eight zero celebrations. Stay tuned.

Derick Rethans 14:09

Thanks for listening to this installment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.



213: They Stole Steve

This week on the podcast, Eric, John, and Thomas talk a lot of PHP. We continue our discussions around what is coming in PHP 8, as well as Laravel Breeze, Xdebug, and more...

PHPUgly streams the recording of this podcast live, typically every Thursday night around 9 PM PT. Come and join us, and subscribe to our YouTube Channel, Twitch, or Periscope. Also, be sure to check out our Patreon Page.


PHP Internals News: Episode 70: Explicit Octal Literal

In this episode of "PHP Internals News" I talk with George Peter Banyard (Website, Twitter, GitHub, GitLab) about an RFC that he has proposed to add an Explicit Octal Literal to PHP.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:15

Hi, I'm Derick, and this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language.

Derick Rethans 0:24

This is Episode 70. Today I'm talking with George Peter Banyard, about a new RFC that he's just proposed for PHP 8.1, which is titled explicit octal literal. Hello George, would you please introduce yourself?

George Peter Banyard 0:38

Hello Derick, I'm George Peter Banyard, I'm a student at Imperial College London, and I contribute to PHP in my free time.

Derick Rethans 0:46

Excellent, and the contribution that you currently have up is titled: explicit octal literal. What is the problem that this is trying to solve?

George Peter Banyard 0:56

Currently in PHP, we have four types of integer literals: decimal, hexadecimal, binary, and octal. Decimal is just your normal decimal numbers; hexadecimal starts with 0x, and then hexadecimal characters, so zero to nine and A to F; and then binary starts with 0b, and then it's only zeros and ones. However, octal notation is just something which looks like a decimal number with a leading zero, which doesn't really look that much different from a decimal number, but it comes from the days of C and everything which just uses a zero as a prefix.

Derick Rethans 1:48

What I have seen is people using array keys for the months, right, and they use 01, 02, 03, up to 07, and then 08 and 09, and then they look at the array and notice that they actually have a zeroth element in there, but no eight or nine. That's something that PHP no longer does, I believe. No, it's mostly that the parser doesn't pick it up anymore. Instead of silently ignoring the eight, it'll just give you an error. You've mentioned that there are these four types of numbers, with octal being the one starting with zero. But what's the problem with that at the moment?

George Peter Banyard 2:31

Sometimes you want to use what looks like a decimal number. So, for example, you're trying to order months, and use the full two digits for the month number instead of just one, so you use 01 as an array key. When you get to 08, it will give a parse error, because it can't parse 08 as an octal number, which is very confusing. Most people don't deal with octal numbers that often, and you would expect everything to be decimal, because numeric strings are always decimal, but not integer literals. So, the proposal is to add an explicit octal notation, which would be 0o. Python does that, JavaScript has it, Rust also has it, to allow a more explicit way to say: oh, I'm dealing with an octal number here, this is intended.
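The four literal forms and the month-key pitfall can be sketched as follows. This assumes PHP 8.1 or later, where the `0o` prefix from this RFC is available; on older versions the `0o16` line is a parse error. The `$months` array is an illustrative example, not taken from the episode.

```php
<?php
// Requires PHP 8.1+ for the explicit 0o prefix.
$decimal  = 14;
$hex      = 0x0E;    // hexadecimal prefix
$binary   = 0b1110;  // binary prefix
$implicit = 016;     // leading zero: silently octal
$explicit = 0o16;    // explicit octal prefix proposed by this RFC

// $bad = 08;        // parse error: 8 is not a valid octal digit

// Quoting the keys avoids the problem: '01' and '08' stay strings,
// because they are not canonical decimal integers.
$months = ['01' => 'January', '08' => 'August'];
```

All five variables above hold the same integer, 14.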

Derick Rethans 3:33

Beyond having the 0b for binary, and the 0x for hexadecimal, the addition of 0o for octal is the plan to add. And is that it?

George Peter Banyard 3:45

That's more or less the proposal. It doesn't break BC, because the parser before would just give a parse error if you had 0o, so there's no BC break possible. Numeric strings are not affected, because since PHP 7.0 hexadecimal strings are not handled anymore as numeric strings. Numeric strings will always be decimal; integer literals will have your four different variants. And maybe a future proposal is to deprecate the implicit octal notation, to always make it decimal, even if you have leading zeros.

Derick Rethans 4:21

At the moment, if I write the literal 014 and echo it, I get 12.

George Peter Banyard 4:27

Because then it's interpreted as an octal. The most bizarre example is if you do var_dump("014" == 014): you will get false, because the numeric string is interpreted as 14, whereas 014 as an integer literal is interpreted as an octal number, which is 12. That is slightly confusing for most people, also because in PHP we mostly deal with HTTP requests, and GET and POST data, where everything is in strings because it's a text protocol. And if you get user input, which is, I don't know, 014, and you're intending to compare month numbers, which are 01, 02 and so on, and then you get to 08, well, then it just fails.
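The mismatch between numeric strings and integer literals can be reproduced with a few lines. This is a minimal sketch; the variable names are illustrative.

```php
<?php
$literal       = 014;    // integer literal with leading zero: octal, so 12
$numericString = "014";  // numeric strings are always read as decimal: 14

// Loose comparison converts the string to the number 14, and 14 != 12.
$looselyEqual = ($numericString == $literal);

// Converting the string explicitly shows both readings.
$asDecimal = (int) $numericString;     // decimal reading
$asOctal   = octdec($numericString);   // octal reading
```

Here `$looselyEqual` is `false`, `$asDecimal` is 14, and `$asOctal` is 12.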

Derick Rethans 5:22

Of course, removing that support means a BC breaking change, which can't happen until PHP nine, of course, which might be a while away from now, let's say that.

George Peter Banyard 5:31

Probably five years, if we're going by the timeline from PHP seven to PHP 8. But to be able to deprecate and remove it, well, you need to add support for something else. So that's more the long term plan.

Derick Rethans 5:46

And your proposal is basically to make it equivalent to binary and hexadecimal numbers, so that it is less confusing in general.

George Peter Banyard 5:55

Yes, that's why the RFC is very short.

Derick Rethans 5:58

What are octal numbers actually used for?

George Peter Banyard 6:02

The only practical use case that I've seen is for Linux permissions, so chmod. Execute, read, and write: those are the permissions for which chmod will use an octal number.
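Octal suits permissions because each octal digit holds exactly three bits: read (4), write (2), and execute (1). A small sketch of how a mode such as 0644 is composed; the constant names are hypothetical helpers, not PHP built-ins.

```php
<?php
// Hypothetical helper constants for the three permission bits.
const PERM_READ    = 4;
const PERM_WRITE   = 2;
const PERM_EXECUTE = 1;

// Owner: read + write, group: read, others: read.
$mode = ((PERM_READ | PERM_WRITE) << 6)  // owner digit
      | (PERM_READ << 3)                 // group digit
      | PERM_READ;                       // others digit

// decoct() renders the integer back as octal digits.
$asOctalString = decoct($mode);
```

The resulting `$mode` equals the literal `0644`, which is the form usually passed as the second argument to `chmod()`.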

Derick Rethans 6:15

In a different order though but

George Peter Banyard 6:17

Yes, I don't know chmod off the top of my head, though.

Derick Rethans 6:20

Is it only Linux permissions that you can think of? Is there anything else? I can't either so I'm asking you.

George Peter Banyard 6:25

No, I can't. That's why I find it very odd that the leading zero just makes it octal instead of anything else. I mean, it has precedent, because many other languages do that, like C, Java, and, I don't know, many other languages that I suppose just picked it up from, I think, C. But when I looked into the history, weirdly enough, before C they had a prefix for binary, octal, and hexadecimal. But then the one for octal just got dropped, for some reason.

Derick Rethans 6:57

Maybe because the zero and the "o" next to each other look very much the same. We've already touched on whether there are BC breaks or not, BC standing for backwards compatibility. And there shouldn't be any, because it's something that the parser currently doesn't understand. But do other built-in extensions need to be modified, for example?

George Peter Banyard 7:18

We have two extensions which deal with numbers. One is GMP, which is arbitrary precision arithmetic. And then there's the filter extension, which filters data, tells you if it's valid or not, and gives you back a correct integer or something like that; it has an octal filter. Both of these extensions have been modified to support the prefix notation and interpret it as a valid octal number. And then we have the function octdec(), which is basically octal to decimal, which weirdly enough already supported the octal prefix.

Derick Rethans 7:59

But that accepts strings, I suppose?

George Peter Banyard 8:01

Yes that that accepts strings.

Derick Rethans 8:04

And it already supported the 0o prefix?

George Peter Banyard 8:07

Yes, which is very on point for PHP I feel. Some things are just supported randomly in one side but not everywhere else.

Derick Rethans 8:15

It's a surprise for me, that is what I can say. So, yeah, you mentioned it's a short RFC. Do you think there will be any extensions to this in the future? You already mentioned maybe deprecating the current plain zero prefix?

George Peter Banyard 8:31

So one other possible future scope is to reintroduce octal, binary, and hexadecimal numbers, with the prefixes, as numeric strings. If you type 0xAABBCC and you have that as a string, which could be useful if you get colours back from a web form, that would be automatically converted into an integer; or not automatically converted, but converted if you compare it to numbers, or if you cast it to an integer. Because currently, if you get 0x-something and you cast it to an integer, you will get zero. So that way you need to use a function like hexdec(), octdec(), or bindec() to convert from a string to another string and then cast that. Or it may be cast directly to an integer, I'm not exactly sure. But it's also debatable if it's something we want to add.
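The current behaviour for a prefixed hexadecimal string can be sketched as follows. The colour value is the one mentioned above; the variable names are illustrative.

```php
<?php
$colour = '0xAABBCC';  // e.g. a colour value arriving as text

// A plain cast stops at the 'x', so only the leading "0" is read.
$cast = (int) $colour;

// Since PHP 7.0 such strings are not numeric strings either.
$isNumeric = is_numeric($colour);

// Explicit conversions: intval() with base 16 accepts the 0x prefix,
// and hexdec() converts the bare hexadecimal digits.
$viaIntval = intval($colour, 16);
$viaHexdec = hexdec('AABBCC');
```

Both explicit conversions return an integer directly (11189196 here), so no further cast is needed.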

Derick Rethans 9:37

Is it actually possible, for example with hexadecimal numbers, if you have them inside a string: can you do \xAA, does that actually work?

George Peter Banyard 9:48

I didn't think so.

Derick Rethans 9:49

That actually works. You can do var_dump("\x6A") and it gives you the letter j.

George Peter Banyard 9:55

The more, you know.

Derick Rethans 9:56

But it doesn't work for binary, or octal. Only for hexadecimal with \x. So I guess that's something that could be added to string interpolation at some point.

George Peter Banyard 10:07

PHP is so weird sometimes.

Derick Rethans 10:10

Yes, I mean PHP does things in its own way. However, making these kinds of small changes to it just ends up improving the language step by step, and that is of course the way forward. Right?

George Peter Banyard 10:23

Yeah.

Derick Rethans 10:25

And I'm looking forward to more of these small incremental changes in the future as well.

George Peter Banyard 10:30

Seems like a good plan.

Derick Rethans 10:32

Are you planning any more?

George Peter Banyard 10:34

Well, so I went through some of the old RFCs, most notably the ones from when the whole scalar type thing was going on. We had strict types, and then we had the coercive types one, which was by Dmitry, Zeev, pretty sure Stas, and I'm forgetting somebody else. But some of the ideas they had were about making some of the type juggling more strict, such as float to integer conversions. Currently, even if the floating point number has a decimal part, it will just truncate it to an integer, and it won't emit any warning, and it will just pass without any issue, which I think is kind of unexpected. I may add a warning to that, to possibly make it a type error in the future.

Derick Rethans 11:24

You mean upon a cast?

George Peter Banyard 11:26

If you've type hinted a function as accepting only integers, so if you say foo(int $bar), and you pass it a float, then in normal mode it will truncate, and it will just pass an integer.

Derick Rethans 11:40

Because it's just typed.

George Peter Banyard 11:42

Yes, and we've had multiple reports of people being very confused about why it's just truncating the numbers, because it's not even rounding up. If you had like if you have like 0.9 it won't round up to one it will just truncate to zero, which a lot of people are confused by.
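The truncation described here can be sketched in coercive (non-strict) mode. The `foo(int $bar)` signature is the one from the conversation; everything else is illustrative. As an assumption worth flagging: PHP versions released after this episode (8.1 and later) report a deprecation for this lossy conversion, so the sketch silences deprecations to show only the resulting value.

```php
<?php
// No declare(strict_types=1) here, so calls use coercive typing mode.

// Newer PHP versions emit a deprecation for the lossy call below;
// it is silenced so that only the resulting value is shown.
error_reporting(E_ALL & ~E_DEPRECATED);

function foo(int $bar): int
{
    return $bar;
}

$truncated = foo(0.9);           // fractional part is dropped, not rounded
$rounded   = (int) round(0.9);   // explicit rounding, if that is what is wanted
```

With `declare(strict_types=1)` at the top of the calling file, the same call would throw a `TypeError` instead of truncating.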

Derick Rethans 11:58

Strict mode doesn't do that?

George Peter Banyard 11:59

Yeah, because strict mode is very strict and will only allow you to pass explicitly what you've requested, with the exception of the normal integer to float conversion, which is lossless.

Derick Rethans 12:12

That's lossless up to a certain point yes.

George Peter Banyard 12:14

To a certain point: if your integer doesn't fit, then it overflows to a float.
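Both limits mentioned here can be observed directly. This is a minimal sketch; the second half assumes a 64-bit build of PHP, which is checked at runtime.

```php
<?php
// Integer arithmetic that exceeds PHP_INT_MAX yields a float.
$overflowed = PHP_INT_MAX + 1;
$isFloat    = is_float($overflowed);

// Widening int to float is only exact up to 2**53 on 64-bit builds,
// because a double has a 53-bit significand.
$roundTripLossy = null;
if (PHP_INT_SIZE === 8) {
    $big            = 2 ** 53 + 1;            // still an int on 64-bit
    $roundTripLossy = ((int) (float) $big) !== $big;
}
```

On a 64-bit build, `$roundTripLossy` is `true`: converting the integer to a float and back does not return the original value.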

Derick Rethans 12:19

All right. George thank you very much for taking your time this afternoon to talk to me.

George Peter Banyard 12:23

Thank you for having me.

Derick Rethans 12:26

Thanks for listening to this installment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.


212: PHP 8 Deep Dive

PHPUgly streams the recording of this podcast live, typically every Thursday night around 9 PM PT. Come and join us, and subscribe to our YouTube Channel, Twitch, or Periscope. Also, be sure to check out our Patreon Page.