PHP Internals News: Episode 61: Stable Sorting

In this episode of "PHP Internals News" I chat with Nikita Popov (Twitter, GitHub, Website) about his Stable Sorting RFC.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:18

Hi, I'm Derick, and this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 61. Today I'm talking with Nikita Popov about a rather small RFC that he's proposing called stable sorting. Hello Nikita, how are you this morning?

Nikita 0:36

Hey, Derick, I'm great. How are you?

Derick Rethans 0:38

Not too bad myself. Let's jump straight in here. The title of the RFC is stable sorting, what does that mean, what is stable sorting, or what is sorting stability?

Nikita 0:48

Sorting stability refers to the behaviour of the sort when it comes to equal elements. And equal here means that the sort comparison function, for example the one you pass to usort, says the elements are equal, but there is still some way to distinguish them. For example, if you're sorting some objects, to take the example from the RFC, we have an array with users, and users have an age, and we use usort to only sort the users by age. Then according to the comparison callback all users with the same age are equal. But of course, the user also has other fields on which we can distinguish it. And the question is now in what order will equal elements appear. If we have a stable sort, then they will appear in the order they were originally in. So the sort is not going to change it.
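The users-by-age example can be sketched in a few lines of PHP. This is only an illustration: the names and ages are made up, and the callback compares nothing but the age, so users of the same age count as equal.

```php
<?php
// Hypothetical sample data: two pairs of users share the same age.
$users = [
    ['name' => 'Alice', 'age' => 30],
    ['name' => 'Bob',   'age' => 25],
    ['name' => 'Carol', 'age' => 30],
    ['name' => 'Dave',  'age' => 25],
];

// The callback only looks at the age, so Alice/Carol and Bob/Dave compare as equal.
usort($users, fn(array $a, array $b): int => $a['age'] <=> $b['age']);

// With a stable sort, equal elements keep their original relative order.
$names = array_column($users, 'name');
echo implode(', ', $names), PHP_EOL; // Bob, Dave, Alice, Carol
```

With a stable sort, Bob stays ahead of Dave and Alice stays ahead of Carol, because that is the order they had in the input.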

Derick Rethans 1:41

And that is not what PHP's sorting mechanism currently does?

Nikita 1:44

Right. PHP currently uses an unstable sort, which means that the order is simply unspecified. It will be deterministic. I mean if you take the same input array and sort it, then every time you will get the same result. But there is no well specified relative order of elements. There's just some order. The reason why we have this behaviour is that there are, I would say, only two sorting algorithms. There is merge sort, which is a guaranteed n log n sort that is stable, but has the disadvantage that it requires additional memory to perform the merge step. On the other side there is quicksort, which is an average case n log n sorting algorithm and is unstable, but does not require any additional memory. And in practice, everyone uses one of these algorithms, usually with a couple of extensions. On the side of merge sort, nowadays we use timsort, which is still based on the same underlying principle, and for quicksort, we have introsort, which is better than quicksort and tries to avoid some of the bad worst case performance which quicksort can have. PHP currently uses a quicksort, which means that our sorting results are unstable.

Derick Rethans 3:07

Okay, and this RFC is suggesting to change that. How would you do that? How would you modify quicksort to make it stable?

Nikita 3:15

Two ways. One is to just change the sorting algorithm. So as I mentioned, the really popular stable sorting algorithm is timsort, which is used by Python, by Java, and probably lots of other languages at this point. And the other possibility is to stick with an unstable sort, so to stick with quicksort, but to artificially enforce that the comparison function does not report equal elements that are not really equal. And we can do that by introducing an extra artificial fallback comparison. We remember the order of the elements in the original array. And if the comparison function tells us that elements are equal, we will check against this original order, which means that, okay, our sort is still unstable, but because the comparison will never actually report that two elements are equal unless they really are equal, it doesn't matter for the result.

Derick Rethans 4:16

So you're basically artificially changing the key to have the original index in the array.

Nikita 4:24

That's pretty much exactly the implementation. And this is actually also how you would implement the stable sort if you'd do it in PHP code. So you would take your array and convert it into an array of pairs, where you have the original array value and the original position of the array element. Difference is just that if you do this in PHP code this is extremely extremely inefficient, in terms of memory and performance, while when we do it internally it's essentially free because we already have a little bit of unused space in each array element. We can easily store the current position.
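A minimal userland sketch of this pairs approach is shown below. The helper name `stableUsort` and the sample data are hypothetical, and the sketch assumes the callback returns an integer. As Nikita points out, doing this in PHP code is far more expensive than the internal implementation, so this only illustrates the idea.

```php
<?php
// Hypothetical helper: makes any usort-style comparison stable by falling
// back to the original position whenever the callback reports equality.
function stableUsort(array &$array, callable $compare): void
{
    // Decorate: pair every value with its original position.
    $pairs = [];
    $position = 0;
    foreach ($array as $value) {
        $pairs[] = [$value, $position++];
    }

    usort($pairs, function (array $a, array $b) use ($compare): int {
        $result = $compare($a[0], $b[0]);
        // Fallback comparison: equal elements are ordered by original position.
        return $result != 0 ? $result : $a[1] <=> $b[1];
    });

    // Undecorate: drop the positions again.
    $array = array_map(fn(array $pair) => $pair[0], $pairs);
}

// 40 elements, so the array is larger than the 16-element threshold
// mentioned later in the conversation.
$items = [];
for ($i = 0; $i < 40; $i++) {
    $items[] = ['group' => $i % 3, 'id' => $i];
}
stableUsort($items, fn(array $a, array $b): int => $a['group'] <=> $b['group']);
```

Because no two pairs ever compare as equal, the result is the same whether the underlying sort is stable or not.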

Derick Rethans 5:02

Do you think there will be much of a performance hit here?

Nikita 5:04

So I expect that there is a bit of a performance hit, but for typical usage, not much. For the good case where your array does not actually contain any equal elements, the overhead should be very small, something like maybe one or two percent. If your array does contain a huge number of duplicates, then there is more overhead, and the effect is basically that the sort performance no longer depends on the number of duplicates you have. Previously if you had a lot of duplicates, then the sort became faster, the more duplicates you had. While now, as you add more duplicates, the sorting performance will stay mostly stable. That's really the difference in performance.

Derick Rethans 5:53

If you have the numbers in the RFC I'll make sure to link to them. Is there a possibility that this is going to break any code?

Nikita 6:01

Yes, it could break tests.

Derick Rethans 6:04

Tests, because the test's output can change because the sorting order of arrays might have changed.

Nikita 6:11

Exactly. So we already had such a change in PHP seven, where we switched from a pure quicksort to a hybrid quicksort and insertion sort, which means that effectively we have a stable sort for arrays smaller than 16 elements and an unstable sort for larger arrays, which is a weird intermediate state.

Derick Rethans 6:33

Yes.

Nikita 6:35

I think that one already had quite a bit of fallout for testing purposes. Hopefully this one will be a little bit smaller because most tests will work on a few elements. Those would have already been stable previously. But there is definitely going to be a little bit of fallout for unit testing.

Derick Rethans 6:56

At the moment we're talking about this, the RFC's already up for voting. By the time this podcast has come out, it's pretty likely that it has been accepted for PHP eight, because I think the voting was 51 to zero or something like this.

Nikita 7:10

It's 36 to zero.

Derick Rethans 7:13

There you go. Thank you, Nikita for taking the time this morning to talk to me about stable sorting.

Nikita 7:19

Thanks for having me.

Derick Rethans 7:23

Thanks for listening to this instalment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.


PHP Internals News: Episode 60: OpenSSL CMS Support

In this episode of "PHP Internals News" I chat with Eliot Lear (Twitter, GitHub, Website) about OpenSSL CMS support, which he has contributed to PHP.

Transcript

Derick Rethans 0:16

Hi, I'm Derick, and this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 60. Today I'm talking with Eliot Lear about adding OpenSSL CMS support to PHP. Hello Eliot, would you please introduce yourself?

Eliot Lear 0:34

Hi Derick, it's great to be here. My name is Eliot Lear, I'm a principal engineer for Cisco Systems working on IoT security.

Derick Rethans 0:41

I saw somewhere on the internet, Wikipedia I believe, that you also did some RFCs, not PHP RFCs, but Internet RFCs.

Eliot Lear 0:49

That's correct. I have a few out there. I'm a jack of all trades but master of none.

Derick Rethans 0:53

The one that piqued my interest was the one for the timezone database, because I added timezone support to PHP a long long time ago.

Eliot Lear 1:01

That's right, there's a whole funny story about that RFC. We will have to save it for another time, but there are a lot of heroes out there in the volunteer world who keep that database up to date, and currently they're corralled and coordinated by a lovely gentleman by the name of Paul Eggert. If you're not a member of that community, it's really a wonderful contribution to make, and they need people all around the world to send in information. But I guess that's not why we're here today.

Derick Rethans 1:29

But I'm happy to chat about that at some other point in the future. Now today we're talking about CMS support in OpenSSL and the first time I saw CMS. I don't think that means content management system here.

Eliot Lear 1:41

No, it stands for Cryptographic Message Syntax, and it is the follow-on to earlier work which people will know as PKCS#7. So it's a way in which one can transmit and receive encrypted information or just signed information.

Derick Rethans 1:58

How do CMS and PKCS#7 differ from each other?

Eliot Lear 2:03

Actually there are not too many differences. Externally, the envelope or the structure of the message is slightly better formed, and the people who worked on that at the Internet Engineering Task Force were essentially just making incremental improvements to make sure that there was good interoperability, good support for email, encrypted email, and signed email, and for other purposes as well. So it's relatively modest but important improvements over PKCS#7.

Derick Rethans 2:39

How old are these two standards?

Eliot Lear 2:42

Goodness. I'm not sure actually how old PKCS#7 is, but CMS dates back, gosh, probably a decade or so. I'd have to go look. I'm sorry that I don't have the answer to that one.

Derick Rethans 2:56

A ballpark figure works fine for me. Why would you want to use CMS over the older PKCS#7?

Eliot Lear 3:02

You know, truthfully, I'm not a cryptographer, so the reason I used it was because it was the latest and greatest thing when you're doing this sort of work. I'm an interdisciplinary person, so what I do is I go find the experts and they tell me what to use. And believe it or not, I went and found the person who's the expert on cryptographic signatures, which is what I need. I said: What should I use? He said: You should use CMS, and so that's what I did. I ran into some troubles though, which is that some of the tooling doesn't support CMS. So, in particular, PHP didn't support CMS. So that's why I got involved in the PHP project.

Derick Rethans 3:40

You are a new contributor to the PHP project. What did you think of its interactions?

Eliot Lear 3:45

I had a wonderful time doing the development. There was a fair amount of coding involved, and one has to understand that the underlying code here is OpenSSL, and OpenSSL's documentation for some of its interfaces could stand a little bit of improvement. I needed to do a fair amount of work and I needed a fair amount of review, so I got a lot of support from Jakub in particular, who looks after the OpenSSL code base as one of the maintainers, and I really enjoyed the CI/CD integration, which allowed me to check the numerous environments that PHP runs on. I really enjoyed the community review, and even though I didn't really have to do one in my case, I did do an RFC as part of the PHP development process, which essentially forced me to write really good documentation, or at least I hope it's really good, for all of the caller interfaces that I defined. So it was a really enjoyable experience. I really liked working with the team.

Derick Rethans 4:47

That's good to hear. Although an RFC wasn't particularly necessary here, I always find writing down the requirements that I have for my own software first, even though this doesn't get publicized or nobody's going to review it, very useful to just clear my head and see what's going on there.

Eliot Lear 5:06

Yeah, I think that's a good approach.

Derick Rethans 5:07

During the review, was there a lot of feedback where you weren't quite sure, or what was the best feedback that you got during this process?

Eliot Lear 5:15

The biggest issue that we had was how to handle streaming, and we have some code in there now for streaming, but it's unlikely to get really heavily exercised in the way that the interfaces are defined right now. It's essentially a files in/files out interface, which mirrors the PKCS#7 interface. One of the future activities that I would like to take on, if I can find a little bit more time, is to move away from the files in/files out interface and rather use an in memory structure or in memory interface, so that we can actually take advantage of streaming and can be more memory efficient over time.

Derick Rethans 5:56

When you say files in, you actually provide a file name to the functions?

Eliot Lear 6:00

That's right, you know, depending on which of the interfaces you're using. There's an encrypt call, there's a decrypt call, there's a sign call and a verify call, and each of them has a slightly different interface. But you know, if you're encrypting you need to have the destination that you're encrypting to. These are all public key, you know, PKI based approaches, so you have to have the destination certificates that you're sending to. If you're verifying you need to have the public key chain, and if you're decrypting you need to have the private key to do all this. But they're all file names that are passed, and it's a bit of a limitation of the original interface in that you probably don't really want to be passing file names to most of your functions. You'd rather be passing objects that are a bit better structured than that.
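As a rough sketch of what this files in/files out style looks like from PHP 8, the example below signs a file with `openssl_cms_sign()` and checks the result with `openssl_cms_verify()`. The key and self-signed certificate are throwaway values generated on the spot, the common name and file contents are made up, and chain validation is switched off with `OPENSSL_CMS_NOVERIFY` because there is no real certificate authority here. It assumes the openssl extension is loaded and can find a working OpenSSL configuration.

```php
<?php
// Throwaway RSA key and self-signed certificate, for illustration only.
$key = openssl_pkey_new([
    'private_key_type' => OPENSSL_KEYTYPE_RSA,
    'private_key_bits' => 2048,
]);
$csr  = openssl_csr_new(['commonName' => 'cms-demo.example'], $key, ['digest_alg' => 'sha256']);
$cert = openssl_csr_sign($csr, null, $key, 1, ['digest_alg' => 'sha256']);

// Files in, files out: both the payload and the signed message live on disk.
$inputFile  = tempnam(sys_get_temp_dir(), 'cms-in-');
$signedFile = tempnam(sys_get_temp_dir(), 'cms-out-');
file_put_contents($inputFile, "description to be signed\n");

// Sign the input file; the result is written to $signedFile (S/MIME encoding by default).
$signed = openssl_cms_sign($inputFile, $signedFile, $cert, $key, []);

// Check the signature only; OPENSSL_CMS_NOVERIFY skips validating the signer's
// certificate chain, which a real application would normally not skip.
$verified = openssl_cms_verify($signedFile, OPENSSL_CMS_NOVERIFY);

var_dump($signed, $verified);

unlink($inputFile);
unlink($signedFile);
```

Note the extra round trip through temporary files even though the payload starts out as a string in memory, which is the limitation Eliot describes.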

Derick Rethans 6:53

Is the underlying OpenSSL interface similar or does that allow for streaming in general?

Eliot Lear 6:59

The C API allows for streaming and such. With the command line interface, it doesn't seem to me that they do any particular things with streaming. If you look at the cryptographic interface that we did for CMS, mostly it is an attempt to provide the capability that you would otherwise have using the OpenSSL command line interface, and I think the nice thing here is that we can evolve from that point.

Derick Rethans 7:26

And the progress wouldn't only be implemented for the CMS mechanism, but also for PKCS#7, as well as others that are also available.

Eliot Lear 7:35

Yes. Another area that I would like to look at, I'm not sure how easy it will be, we didn't try it this time was to try and combine the code bases because they are so close, and be a little bit more code efficient, but there are just slight enough differences in the caller interfaces between PKCS#7 and CMS that, I'm not sure I could get away with using void functions for everything I have. I might have to have a lot of switches, or conditionals in the code. But what I am interested in doing for both sets of code is, again, providing new interfaces, where instead of passing file names, you're passing memory structures of some form that can be used to stream. That's the future.

Derick Rethans 8:22

I've been writing quite a bit of Go code in the last couple of months. And that interface is exactly the same, you provide file names to it, which I find kind of annoying because I'm going to have to distribute these binaries at some point. And I don't really want any other dependencies in the form of files, so I need to figure out a way how to do that without also providing those key files at some point.

Eliot Lear 8:43

Indeed, that's an issue, and for those of us who are web developers, well, I did this because I was doing some web development. A lot of the stuff that I want to do, I just want to do in memory and then pass right back to the client, and I don't really want to have to go to the file system. And right now, I'll have to take an extra step to go to the file system, and that's alright, it's not a big deal, but it'll be a little bit more elegant when I get away from that. We'll do that, you know, at an appropriate time.

Derick Rethans 9:11

Yes, that sounds lovely. I'm not an expert in cryptography either. I saw that the RFC mentions X.509. How does it tie in with CMS and PKCS#7?

Eliot Lear 9:21

X.509 is essentially a certificate standard. In fact, that's really what it is. A certificate essentially has a bunch of attributes, along with a subject being one of those attributes and a signature on top of the whole structure. And the signature comes from a signer, and the signer is essentially asserting all of these attributes on behalf of whoever sent the request. X.509 certificates are, for example, the core of our web authentication infrastructure. When you go to the bank online, it uses an X.509 certificate to prove to you that it is the bank that you intended to visit. That's the basis of this, and CMS and PKCS#7 are structures that allow the X.509 standard to be serialized, so there are the Distinguished Encoding Rules that are used underneath PKCS#7 and CMS. And then CMS essentially was designed at least in part for mail transmission. So how is it that you indicate the certificate, the subject name, the content of the message? All of this information had to be formally described, and it had to be done in a way that is scalable. And the nice thing about X.509, as compared to say just using naked public keys, is that with naked public keys, the verifier or the recipient has to have each individual public key, whereas X.509 uses the certificate hierarchy such that you only need to have the top of the chain, if you will, in order to validate a certificate. So X.509 scales amazingly well, and we see that success all throughout the web. And so that's what CMS and PKCS#7 help support.

Derick Rethans 11:24

Like I said, I've never really done enough research into this but I think it is something that many web developers should really know how that works because this comes back, not only with mail, but also with HTTPS.

Eliot Lear 11:35

It's another part of the code, right. So CMS isn't directly used for supporting TLS connections, there's a whole set of code inside of PHP for that.

Derick Rethans 11:44

Would you have anything else to add?

Eliot Lear 11:46

I would say a couple of things. The basis of this work was that I was attempting to create signatures for something called manufacturer usage descriptions. The reason I got involved with PHP is that I'm doing tooling that supports an IoT protection project. And these manufacturer usage descriptions essentially describe what an IoT device needs in terms of network access. And the purpose of using PHP and adding the code that I added was so that those descriptions could be signed, and that's why Cisco, my employer, supported my activity. Now Cisco loves giving back to the community. This was one way we could do so, and it's something I'm very proud of when it comes to our company. And so we're very happy to participate with the PHP project. I really enjoyed working with

Derick Rethans 12:33

That's good to hear. I'm looking forward to some other API improvements because I agree that the interfaces that the OpenSSL extension has aren't always the easiest to use, and I think it's important that encryption is easy to use, because more people will use it, right?

Eliot Lear 12:49

I have to say, in my opinion, the encryption interfaces that we have today are still relatively immature. And not just CMS, the code that I wrote, which is really, you know, fresh, it just got committed, but the whole category of interfaces is something that will evolve over time, and it's important that it do so because the threats are evolving over time and people need to be able to use these interfaces, and we can't all be cryptographic experts. I'm not. I just use the code, but I needed to write some in order to use it in my case. But as we go on, I think we'll enjoy richer and easier to use interfaces that normal developers can use without being experts.

Derick Rethans 13:38

PHP has been going that way already a little bit because we started having a simple random interface, and a simple way of doing hashes and verifying hashes, to make these things a lot easier, because we saw that lots of people were implementing their own ways in PHP code, and pretty much messing it up because, as you say, not everybody's a cryptographer.

Eliot Lear 13:56

That's right. And so that's a really good thing that PHP did, because as you pointed out, it eliminates all the people who are going onto the net looking for the little snippet of code that they're going to include in PHP. Whether that snippet is correct or not, that's a big issue.

Derick Rethans 14:11

Absolutely. And cryptography is not something that you want to get wrong.

Eliot Lear 14:15

That's right, because for every line of code that you've written in this space, there's going to be somebody who's going to want to attack it, maybe several.

Derick Rethans 14:23

Absolutely. Thank you, Eliot, for taking the time this morning to talk to me about CMS support.

Eliot Lear 14:28

It's been my pleasure Derick, and thanks for having me on. And again, it was really enjoyable to work with the PHP team and I'm looking forward to doing more.

Derick Rethans 14:38

Thanks for listening to this instalment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool, you can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.


PHP 8: A Quick Look at JIT

Following on from a PHP 8/JIT benchmark on Twitter, I decided to have a look myself.

I've picked an example that I know speeds up really well when reimplementing it in C. I wrote about this RDP algorithm some time ago.

It takes a line of geospatial points (lon/lat coordinates) and simplifies it. It's my go-to example to show raw algorithmic performance, which is probably the best place to use a JIT for non-trivial code. I actually use this in production.
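For readers who have not met the algorithm, here is a minimal sketch of Ramer-Douglas-Peucker line simplification in PHP. It is not the benchmarked `RDP::simplify` implementation from the Gist: the function names and sample points are made up, and it treats coordinates as points on a flat plane rather than doing any geodesic maths.

```php
<?php
// Distance from point $p to the straight line through $a and $b (planar).
function perpendicularDistance(array $p, array $a, array $b): float
{
    [$x, $y]   = $p;
    [$x1, $y1] = $a;
    [$x2, $y2] = $b;
    $dx = $x2 - $x1;
    $dy = $y2 - $y1;

    if ($dx == 0 && $dy == 0) {
        // Degenerate line: both end points coincide.
        return hypot($x - $x1, $y - $y1);
    }

    return abs($dy * $x - $dx * $y + $x2 * $y1 - $y2 * $x1) / hypot($dx, $dy);
}

// Hypothetical sketch of RDP: drop points closer than $epsilon to the
// line between the end points, recursing around the farthest point.
function rdpSketch(array $points, float $epsilon): array
{
    $n = count($points);
    if ($n < 3) {
        return $points;
    }

    // Find the point farthest from the line between first and last point.
    $maxDistance = 0.0;
    $index = 0;
    for ($i = 1; $i < $n - 1; $i++) {
        $distance = perpendicularDistance($points[$i], $points[0], $points[$n - 1]);
        if ($distance > $maxDistance) {
            $maxDistance = $distance;
            $index = $i;
        }
    }

    // Everything is close enough: keep only the end points.
    if ($maxDistance <= $epsilon) {
        return [$points[0], $points[$n - 1]];
    }

    // Otherwise keep the farthest point and simplify both halves.
    $left  = rdpSketch(array_slice($points, 0, $index + 1), $epsilon);
    $right = rdpSketch(array_slice($points, $index), $epsilon);
    array_pop($left); // the split point is also the first element of $right

    return array_merge($left, $right);
}

$line = [[0, 0], [1, 0.1], [2, 0], [3, 5], [4, 0]];
$simplified = rdpSketch($line, 1.0);
// The near-collinear point [1, 0.1] is dropped; the spike at [3, 5] is kept.
```

The work is dominated by tight numeric loops and recursion, which is the kind of code where a JIT or a C implementation has the most room to help.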

With PHP 7.4:

$ pe 7.4dev; time php -n \
        -dzend_extension=opcache -dopcache.enable=1 -dopcache.enable_cli=1 \
        -dopcache.jit=1235 -dopcache.jit_buffer_size=64M \
        bench-rdp.php 1000
Using array (
  0 => 'RDP',
  1 => 'simplify',
)

real    0m8.778s
user    0m8.630s
sys     0m0.117s

(I realise that the opcache arguments do nothing on the command line here). This runs RDP::simplify (my PHP implementation) 1000 times in about 8 seconds.

With PHP 8.0 and JIT:

$ pe trunk; time php -n \
        -dzend_extension=opcache -dopcache.enable=1 -dopcache.enable_cli=1 \
        -dopcache.jit=1235 -dopcache.jit_buffer_size=64M \
        bench-rdp.php 1000
Using array (
  0 => 'RDP',
  1 => 'simplify',
)

real    0m4.640s
user    0m4.627s
sys     0m0.008s

It jumps from ~8.8s to ~4.6s, a reduction in time of ~4.2s (or 48%), which is pretty good.

Now if I run the same with the geospatial extension which has a C implementation.

With PHP 7.4 and the extension:

$ pe 7.4dev; time php -n -dextension=geospatial \
        -dzend_extension=opcache -dopcache.enable=1 -dopcache.enable_cli=1 \
        -dopcache.jit=1235 -dopcache.jit_buffer_size=64M bench-rdp.php 1000
Using 'rdp_simplify'

real    0m0.695s
user    0m0.675s
sys     0m0.021s

Which gives a reduction in time compared to the PHP implementation on PHP 7.4 of ~8.1s (or 92%).
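The percentages can be re-derived from the `real` timings pasted above. The small helper below is only a worked calculation with a made-up function name. Note that the unrounded timings give about 47% for the JIT run, while the 48% quoted earlier comes from the rounded ~8.8s and ~4.6s figures.

```php
<?php
// "real" wall-clock timings in seconds, copied from the runs above.
$php74Seconds     = 8.778; // PHP 7.4, PHP implementation
$php80JitSeconds  = 4.640; // PHP 8.0 with JIT, PHP implementation
$extensionSeconds = 0.695; // PHP 7.4 with the C implementation

// Hypothetical helper: time saved, relative reduction, and speed-up factor.
function reduction(float $baseline, float $measured): array
{
    return [
        'saved'   => round($baseline - $measured, 1),
        'percent' => round(($baseline - $measured) / $baseline * 100, 1),
        'speedup' => round($baseline / $measured, 1),
    ];
}

$jit = reduction($php74Seconds, $php80JitSeconds);
$ext = reduction($php74Seconds, $extensionSeconds);

printf("JIT:       %.1fs saved, %.1f%% less time, %.1fx faster\n", $jit['saved'], $jit['percent'], $jit['speedup']);
printf("Extension: %.1fs saved, %.1f%% less time, %.1fx faster\n", $ext['saved'], $ext['percent'], $ext['speedup']);
// JIT:       4.1s saved, 47.1% less time, 1.9x faster
// Extension: 8.1s saved, 92.1% less time, 12.6x faster
```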

So it looks like the JIT does do some good work for something that's highly optimisable, but still nowhere near what an implementation in C could do.

The code that I used is in this Gist.

This ran on a 4th gen ThinkPad X1 Carbon, making sure my CPU was pinned at its maximum speed of 3.3 GHz. Although I've pasted only one result for each, I did run them several times with very close outcomes.


PHP Internals News: Episode 59: Named Arguments

In this episode of "PHP Internals News" I chat with Nikita Popov (Twitter, GitHub, Website) about his Named Parameter RFC.

Transcript

Derick Rethans 0:18

Hi, I'm Derick, and this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 59. Today I'm talking with Nikita Popov about a few RFCs that he's produced. Hello Nikita, how are you this morning?

Nikita Popov 0:35

Hey Derick, I'm great. How are you?

Derick Rethans 0:38

Not too bad, not too bad today. I think I made a decision to stop asking you to introduce yourself because we've done this so many times now. We have quite a few things to go through today. So let's start with the bigger one, which is the named arguments RFC. We have in PHP eight already seen quite a few changes to how PHP deals with set up and things like that. We have had argument promotion in constructors, we have the mixed type, we have union types, and now named arguments, I suppose, build on top of that again. So what are named arguments?

Nikita Popov 1:07

Currently, if you're calling a function or a method you have to pass the arguments in a certain order, so in the same order in which they were declared in the function or method declaration. And what named arguments or parameters allow you to do is to instead specify the argument names when doing the call. Just taking the first example from the RFC, we have the array_fill function, and the array_fill function accepts three arguments. So you can call it like array_fill( 0, 100, 50 ). Now, what does that actually mean? This function signature is not really great because you can't really tell what the meaning of these parameters is and in which order you should be passing them. So with named parameters, the same call would be something like: array_fill, where the start index is zero, the number is 100, and the value is 50. And that should immediately make this call much more understandable, because you know what the arguments mean. And this is really one of the main motivations or benefits of having named parameters.
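In released PHP 8 the array_fill example looks roughly like this. The parameter names below are the ones documented for PHP 8.0 (`start_index`, `count`, `value`), which may differ from the names used in the RFC draft being discussed at the time.

```php
<?php
// Positional call: the meaning of each number is not obvious at the call site.
$positional = array_fill(0, 100, 50);

// Named call (PHP 8.0+): the same call, but self-documenting.
$named = array_fill(start_index: 0, count: 100, value: 50);

var_dump($positional === $named); // bool(true)
```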

Derick Rethans 2:20

Of course developers that use an IDE already have this information available through an IDE. But of course named arguments will also start working for people that don't have, or don't want to use an IDE at that moment.

Nikita Popov 2:31

At least in PhpStorm, there is a feature where you can enable these argument labels, for constants typically only. This would basically move this particular information into the language, but I should say that of course this is not the only advantage of having named parameters. So making code more self documenting is one aspect, but there are a couple more of them. I think one important one is that you can skip default values. So if you have a function that has many optional arguments, and you only want to, say, change the last one, then right now you actually have to pass all the arguments before the last one as well, and you have to know: Well, what is the correct default value to pass there, even though you don't really care about it.
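A short sketch of skipping defaults, using `htmlspecialchars()` as an illustration. Only the last parameter, `double_encode`, is changed; the sample string is made up, and the positional version assumes the PHP 8.0 defaults for flags.

```php
<?php
$input = '&amp; <b>';

// Positional: to reach the fourth parameter, the defaults for flags and
// encoding have to be spelled out by hand.
$positional = htmlspecialchars($input, ENT_COMPAT | ENT_HTML401, 'UTF-8', false);

// Named (PHP 8.0+): jump straight to the parameter we care about.
$named = htmlspecialchars($input, double_encode: false);

echo $named, PHP_EOL; // &amp; &lt;b&gt;
```

The already encoded `&amp;` is left alone because double encoding is switched off, while the angle brackets are still escaped.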

Derick Rethans 3:19

If I remember correctly, there are a few functions in PHP's standard library, where you cannot actually replicate the default value with specifying an argument value, because they have this really complex and weird kind of behaviour.

Nikita Popov 3:33

That's true, but that's something we're trying to eliminate in PHP eight mostly.

Derick Rethans 3:39

And of course, additionally, you'd never have to remember whether in_array and array_search have needle or haystack first, which is also beneficial.

Nikita Popov 3:46

That's true. Yeah.

Derick Rethans 3:48

You mentioned that there are a few other benefits as well. You mentioned self documenting and the skipping of arguments, what other benefits are there?

Nikita Popov 3:54

The other part is that you can also reorder the parameters. So this varies a little bit by language. In some languages you're required to still pass the arguments in the same order they were declared, even if you're using named parameters. But for the purposes of PHP, we would allow passing them in arbitrary order. Just like you said, you don't have to remember if the haystack is first, or the needle comes first. And I think one case where all of these benefits play together particularly well is when it comes to object construction. So you already mentioned that we have the constructor promotion RFC in PHP eight, which makes it pretty simple to declare value objects. So you just list all the available properties and their default values and types in the constructor and you're done. But when you actually instantiate the object, the ergonomics are still not particularly good, because you have to remember in which order you have to pass the parameters, and you don't really know which parameter is which just by looking at the call. And once again, you have to specify everything and you can't just skip a few of them with default values. And if you have a constructor with maybe five or six arguments coming in, which is maybe unusual for normal methods, but I think somewhat normal for constructors in particular, then the current development experience there is just not very nice. And named parameters would essentially provide us something akin to an object initialization syntax which is available in many other languages, and which has also been proposed for PHP previously. But you would get this just as a side effect of combining constructors and named parameters, without having to define any kind of special semantics for how object construction works, and how initializer syntax interacts with constructors and so on.
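The combination with constructor promotion can be sketched like this. The `UserProfile` class and its properties are hypothetical, chosen only to show reordering and skipped defaults at the call site.

```php
<?php
// Hypothetical value object using PHP 8.0 constructor promotion.
final class UserProfile
{
    public function __construct(
        public string $name,
        public int $age = 0,
        public string $country = 'unknown',
        public bool $active = true,
    ) {
    }
}

// Arguments are given out of declaration order; $age and $active keep their defaults.
$user = new UserProfile(country: 'NL', name: 'Derick');

echo "{$user->name}, {$user->age}, {$user->country}", PHP_EOL; // Derick, 0, NL
```

The call reads much like an object initializer, without any dedicated initializer syntax in the language.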

Derick Rethans 5:55

That ties in again with the object ergonomics that I spoke about with Larry earlier this season as well.

Nikita Popov 6:01

Yeah, I believe that this combination of constructor promotion and named parameters for constructors was one of the things.

Derick Rethans 6:10

We've spoken a little bit about what it is. Now, how would you use this in PHP, what is the syntax that you're proposing?

Nikita Popov 6:18

I mean, syntax is always a bikeshedding question. The particular one I am proposing for now is to say the parameter name as a literal, so no dollar in front of it or something, and then a colon and the value you want to pass.

Derick Rethans 6:35

Is there any precedent for this syntax already, either in PHP or outside of PHP?

Nikita Popov 6:41

In PHP, not really. I mean, in PHP, we usually use the double arrow to have any kind of key value mapping. This is sort of a key value mapping. In other languages, yes, the syntax does exist. I'm actually not sure which languages exactly use it. Probably C sharp and Kotlin. Python uses just an equal sign. Well, there are a couple who use it. I actually initially used the double arrow syntax because it's more familiar in PHP, but I found that it doesn't really read as nicely. And I also have some ideas on how we can integrate this colon syntax into the language in a more consistent way.

Derick Rethans 7:27

I think I saw in the RFC that the only way you can specify the keys is by literal and not by a variable.

Nikita Popov 7:34

That's right. This is mainly just to avoid confusion. Well, if you allow specifying a variable, then the question is: is this variable just the parameter name? Because, I mean, in the signature you also write this as a variable. Or is it the variable that contains the parameter name, like variable variables in PHP? So I think to sidestep that confusion, we just allow identifiers, but you can still use variable parameter names through the argument unpacking syntax.

Derick Rethans 8:04

How does that work?

Nikita Popov 8:05

So PHP supports the three dots, the ellipsis operator, both in the function declaration and for function calls. In the declaration, that just means collect all the trailing arguments. And at the call, it means that you have an array, and the elements of this array should be interpreted as function arguments. And named parameters extend that by also allowing array keys. If you unpack an array with string keys, then those will be interpreted as parameter names, and we'll use the usual named parameter passing semantics.
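The unpacking behaviour described here can be sketched as follows, assuming PHP 8.0 semantics. The `describe()` function and its parameters are hypothetical.

```php
<?php
// Hypothetical function with two optional parameters.
function describe(string $name, int $age = 0, string $city = 'unknown'): string
{
    return "$name, $age, $city";
}

// The parameter names are only known at runtime, as string keys of an array.
$args = ['city' => 'London', 'name' => 'Ada'];

// Unpacking an array with string keys treats the keys as parameter names.
$result = describe(...$args);
```

This is how a "variable" parameter name can still be used even though the literal `name: value` syntax only accepts identifiers.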

Derick Rethans 8:47

Interesting. I actually missed that while reading the RFC. To be fair, I skimmed it, not really read it. Yeah, it's good to see that actually. Now, people currently use positional arguments and not named arguments. How would these two interact?

Nikita Popov 9:01

Mostly, named parameters are just syntax for positional arguments, so we perform an internal transformation to convert named parameters into positional parameters. As far as both the engine and the callee are concerned, they don't really know about named parameters at all. They see a usual positional call where all the missing arguments have been filled in with default values. I think the only part to watch out for there is exactly this case of variadics, because previously the variadic parameter could only contain a list of arguments, and now it can also have string keys, for left over named parameters. So those which did not have a matching parameter in the function signature will now get collected into the variadic parameter. I think that's like the only case where I know that the calling convention really changes for the recipient of the arguments.
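A minimal sketch of the variadics change, assuming PHP 8.0 behaviour. The `collect()` function and the argument names `label` and `flag` are hypothetical.

```php
<?php
// Hypothetical function: one regular parameter plus a variadic parameter.
function collect(int $first, ...$rest): array
{
    // $rest receives extra positional arguments under integer keys and
    // unmatched named arguments under string keys.
    return $rest;
}

// 'label' and 'flag' do not match any declared parameter,
// so they end up in the variadic array with their names as keys.
$rest = collect(1, 2, 3, label: 'x', flag: true);
```

Code that assumed a variadic parameter is always a plain list (for example by relying on integer indexes only) is where this difference becomes visible.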

Derick Rethans 10:02

Because where they otherwise got a normal array, they now get a bunch of things potentially having keys in there as well. What would happen if I specify a named argument by name and also include it in the variadics?

Nikita Popov 10:15

So generally the rule is always that you can pass a parameter at most once. You can have the situation where you first pass some positional arguments, and then you pass named arguments. If you do that, a named argument cannot clash with a previously passed positional argument. If you run into this kind of situation, we will always throw an exception at that point. So you're not allowed to overwrite the previous argument, or something like that.
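The "at most once" rule can be sketched like this, assuming PHP 8.0 behaviour, where the clash surfaces as an `Error` at the call. The `greet()` function is hypothetical, and the exact error message is deliberately not relied upon.

```php
<?php
// Hypothetical two-parameter function.
function greet(string $greeting, string $name): string
{
    return "$greeting, $name";
}

$caught = false;
try {
    // 'Hello' already fills $greeting positionally,
    // so naming $greeting again clashes with it.
    greet('Hello', 'World', greeting: 'Hi');
} catch (Error $e) {
    $caught = true;
}
```

Mixing the two styles is fine as long as every parameter is supplied only once, for example `greet('Hello', name: 'World')`.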

Derick Rethans 10:42

The same would apply if a method took named arguments and also had a variadics array, in case you specify more arguments than the function would take. If you'd have that name again in the variadics, that would have already clashed before it even gets turned into a variadic. Are the names that you give to named arguments case sensitive or case insensitive?

Nikita Popov 11:04

They are case sensitive. Because the parameters you specify in the function are just variables and variables in PHP are case sensitive as well.

Derick Rethans 11:14

At the moment, if you inherit a method in an inheriting class, then it doesn't particularly matter what the names of these method arguments are. When you now get named arguments, is this going to change? Because at the moment PHP doesn't enforce that the parameter names of inheriting methods are the same as the ones they are overriding in the parent class.

Nikita Popov 11:37

This is one of the bigger open questions we have. The problem is that if you call a method with the names from the parent class, and the child class changed them, then you'll get an error, because this named parameter just doesn't exist in the child class. And there are a couple of ways to approach that. One is to forbid any kind of parameter name changes during inheritance, which would be a fairly significant backwards compatibility break because, well, it never mattered in the past, and based on some cursory analysis, parameter name changes are somewhat common in code right now. The other possibility is to just ignore this issue, and expect that a lot of code is never going to use named parameters. Using named parameters only makes sense with some types of methods. If you have a method that only accepts one argument, you can be pretty sure that no one's going to call it with a named parameter. So there is the option of just ignoring this issue and fixing it as it comes up, more or less. Which is maybe not the most principled approach. But if we look at other languages that do make heavy use of named parameters, for example Python, we see that they also just ignore the problem. So it looks like in practice this does work out. Of course, a significant difference there is that Python has had named parameters for a long time already. We would be retrofitting them onto an old language. So the situation is somewhat different and probably rather more dangerous for us.
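The inheritance problem can be sketched as follows. This assumes the behaviour PHP 8.0 was eventually released with, where renaming a parameter in a child class is allowed and a call using the parent's name fails at runtime. The class and parameter names are hypothetical.

```php
<?php
// Hypothetical parent and child that disagree on the parameter name.
class Repository
{
    public function find(int $id): string
    {
        return "parent:$id";
    }
}

class CachedRepository extends Repository
{
    // Same position and type, but a different parameter name.
    public function find(int $key): string
    {
        return "child:$key";
    }
}

$repo = new CachedRepository();

// Using the child's own parameter name works.
$ok = $repo->find(key: 5);

// Using the parent's name fails, because the child has no parameter $id.
$failed = false;
try {
    $repo->find(id: 5);
} catch (Error $e) {
    $failed = true;
}
```

Code written against `Repository` that calls `find(id: ...)` breaks as soon as it is handed a `CachedRepository`, which is the kind of mismatch a static analysis tool could flag.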

Derick Rethans 13:14

This is something of course that static analysis tools can check for quite easily and I would argue that they probably should start doing that as well.

Nikita Popov 13:22

That's right, so this is both something easy to check for, and also easy to automatically fix.

Derick Rethans 13:28

Except that you need to choose which one is the correct name, of course.

Nikita Popov 13:32

Yeah, that's right.

But there is one more possibility, which is to allow the parameter names from both the parent method and the child method. This will be like more or less a transparent way to fix that issue. The only problem you can run into is if both the parent method and the child method use the same parameter name but in a different position. If we would go with this option, then we would say that only in this particular case, where a parameter name is reused but in a different position, that would become an inheritance error.

Derick Rethans 14:04

I quite like that actually, because that's a pragmatic approach isn't it?

Nikita Popov 14:07

I also quite like it, maybe it's just technically a bit problematic.

Derick Rethans 14:11

I can already imagine that if this gets accepted for PHP eight, which of course is not sure at the moment, Xdebug is going to have to show the variadics with the named array elements, which of course it doesn't do yet because it has no notion of them. But that's good to know, to have a heads up on these things.

PHP eight has already seen quite a lot of work for internal methods to get their names properly recorded as well, with the stubs that you have already been working on. How do named arguments tie in with this?

Nikita Popov 14:38

The actual named arguments proposal is already pretty old. It dates back to PHP 5.6, I think, and one of the open questions since then was how we handle internal functions, because they don't really have a notion of default values. We have optional parameters, but the default value is not known to the engine, it's only known to the implementation. There are kind of ways to work around that. They are not really safe, so they will work for most functions, but for some, which rely on things like argument context, we might end up just crashing if the function is used with named parameters in a particularly weird way. One of the nice things in PHP eight is that, thanks to the stub effort, we actually have default values for functions available as metadata, so it's available for reflection, and we would also be able to use this for named parameters. If an internal function parameter has been skipped, we can essentially fetch it from reflection and fill in the value, the same way we would do for normal user functions. The issue there is that this only works if there are stubs available. This works for all of our internal functions. I mean, not internal but bundled functions for PHP, but it will not work out of the box with old extensions. So it will mostly work, just this kind of parameter skipping is not going to work. So it will give you an error like: okay, we don't have default information for this function, so you can't call it like this.
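Skipping optional parameters of a bundled function might look like the sketch below, assuming PHP 8.0 or later. `htmlspecialchars()` is a real built-in whose fourth parameter is named `$double_encode`; the input string is just an illustration.

```php
<?php
$input = '&amp; <b>';

// Default behaviour: existing entities are encoded again.
$double = htmlspecialchars($input);

// Skip $flags and $encoding entirely and only set $double_encode.
// The engine fills in the skipped defaults for the internal function.
$single = htmlspecialchars($input, double_encode: false);
```

Without named arguments, changing only the fourth parameter means repeating the defaults for the second and third ones at every call site.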

Derick Rethans 16:17

There's this common myth saying that reflection is actually a very slow thing, you should never use this in your code. Is this going to be a concern for using reflection information this way for internal functions?

Nikita Popov 16:29

Well, I mean, it's not like we will be directly using reflection itself, but internal APIs that do the same thing. There is a performance concern here, because we store the default values not as values but as strings. So in the worst case we actually have to parse those strings, convert them into a syntax tree, validate the syntax tree, and so on. That's of course slow, but it's not like we can't add a bit of caching in there to make sure this only happens once, at which point the problem should be avoided.

Derick Rethans 17:02

Especially when you use things like opcache.

Nikita Popov 17:04

I should say that I do expect named parameter calls to be generally slower than positional calls, so maybe in super performance critical code you would stick with the positional arguments.

Derick Rethans 17:16

I mean, it would work perfectly well for object construction still, right?

Nikita Popov 17:19

For object construction the real cost is really in the object allocation and so on.

Derick Rethans 17:24

With the introduction of named arguments, are there going to be any BC breaks, potentially?

Nikita Popov 17:29

There are not going to be any direct BC breaks, but there are of course some concerns. The first one is the change I mentioned about the variadics: that variadics can now have string keys. But I should clarify what I mean by no BC breaks. If you don't use named arguments, then nothing is going to break. But of course, if named arguments are used with code that did not expect them, then we can run into some issues. So that's one of the issues. And the other one is more of a like long term maintenance concern: if we introduce named parameters, then parameter names become significant to the API, which means you cannot rename parameters in minor versions of a library if you're semver compatible. Because you might be breaking some code using those parameter names. And I think one of the biggest concerns that has come up in the discussion is that this is a significant increase in the API burden for open source libraries.

Derick Rethans 18:34

Because now suddenly, they have to think about the names of the arguments to all their methods as well, right?

Nikita Popov 18:39

So I think, like, the merits of this proposal mostly come down to how much additional burden this imposes on people maintaining libraries versus how much ergonomics improvement we get out of the feature for everyone else. One more thing to consider is that named parameters really change how you design APIs, or what APIs you can reasonably design. So right now, if you have a method with, for example, three boolean arguments, that would be like a really horrible method, because you call it like: true, true, false. Like, what does this mean? If you have named parameters, and you have the same three boolean arguments, then it's not really a problem any more. Because you say what each argument means, and you can leave out arguments that you don't want to modify.
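The three-boolean case can be sketched like this, assuming PHP 8.0 named arguments. The `toCsv()` function and its flags are hypothetical and exist only to show how the two call styles read.

```php
<?php
// Hypothetical function with three boolean switches.
function toCsv(array $rows, bool $header = true, bool $quoteAll = false, bool $trim = false): string
{
    if (!$header) {
        array_shift($rows); // drop the header row
    }
    $lines = [];
    foreach ($rows as $row) {
        $cells = [];
        foreach ($row as $cell) {
            $cell = (string) $cell;
            if ($trim) {
                $cell = trim($cell);
            }
            $cells[] = $quoteAll ? '"' . $cell . '"' : $cell;
        }
        $lines[] = implode(',', $cells);
    }
    return implode("\n", $lines);
}

$rows = [['name', 'age'], [' Ada ', 36]];

// Positional: the meaning of each boolean is invisible at the call site.
$opaque = toCsv($rows, true, false, true);

// Named: only the flag that differs from its default is mentioned.
$clear = toCsv($rows, trim: true);
```

Both calls produce the same output, but the second one documents itself and does not need the two defaults repeated.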

Derick Rethans 19:30

You mentioned that this RFC is quite old already. Do you think this will make it into PHP eight, as we're getting closer and closer to feature freeze, we're not quite there yet we have another month or so to go. Do you think it's ready enough to throw to the lions, so to speak?

Nikita Popov 19:46

So I think I will at least give it a try, because I do think that PHP eight is a good target for such a change. Even though it nominally does not break backwards compatibility, it does have a very significant impact in practice, so it would be good to put this in a major version. And additionally, we also did all this work on stubs in PHP eight, so this also fits in very well. Oh, and finally, one thing I didn't mention before is that we get attributes in PHP eight. And attributes essentially replace the existing Doctrine annotation system, which already supports named parameters.

For all the code that is now going to migrate from Doctrine Annotations to PHP Attributes, it would be helpful if we had named parameters, because it would make the migration a lot more straightforward, because you don't also have to change the meaning of the arguments at the same time.

Derick Rethans 20:51

I'm curious to see what the reception of this will be, especially when it is going to be voted for.

Nikita Popov 20:57

Yeah me as well. I never did get this to voting, the last time around, but we should at least get a vote this time and well if it doesn't go through then there is always next time.

Derick Rethans 21:10

There's always next time, yes. Okay Nikita, thank you for taking the time this morning to talk to me about named arguments.

Nikita Popov 21:17

Thanks for having me Derick.

Derick Rethans 21:20

Thanks for listening to this instalment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.


PHP Internals News: Episode 58: Non-Capturing Catches


In this episode of "PHP Internals News" I chat with Max Semenik (GitHub) about the Non-Capturing Catches RFC that he's worked on, and that's been accepted for PHP 8, as well as about bundling, or not, of extensions.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:18

Hi, I'm Derick, and this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 58. Today I'm talking with Max Semenik about an RFC that he's proposed called non-capturing catches. Hello Max, would you please introduce yourself?

Max Semenik 0:38

Hi Derick. I'm an open source developer, working mostly on MediaWiki. So that's how I came to be interested in contributing to PHP.

Derick Rethans 0:50

Have you been working with MediaWiki for a long time?

Max Semenik 0:53

Something like 11 years, I guess.

Derick Rethans 0:56

That sounds like a long time to me. The RFC that you've made, what is the problem that it is trying to address?

Max Semenik 1:03

In current PHP, you have to specify a variable for exceptions you catch, even if you don't need to use this variable in your code, and I'm proposing to change it to allow people to just specify an exception type.

Derick Rethans 1:20

At the moment, the way you catch an exception is by using catch, opening parenthesis, exception class, variable, and you're saying that you don't have to give the name of the variable any more. Did I get that right?

Max Semenik 1:33

Yes.
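A minimal sketch of a non-capturing catch, assuming PHP 8.0 or later. The `safeDivide()` function is hypothetical; `intdiv()` and `DivisionByZeroError` are built in.

```php
<?php
// Hypothetical helper that swallows a division by zero.
function safeDivide(int $a, int $b): ?int
{
    try {
        return intdiv($a, $b);
    } catch (DivisionByZeroError) {
        // Only the exception type is given; no variable is needed
        // because the exception object itself is never used.
        return null;
    }
}

$fine = safeDivide(10, 2);
$none = safeDivide(10, 0);
```

Before PHP 8.0 the catch clause would have had to read `catch (DivisionByZeroError $e)` even though `$e` is never touched.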

Derick Rethans 1:34

Is that pretty much the only change that this is making?

Max Semenik 1:38

Yes, it's a very small, and well defined RFC. I just wanted to do something small, as my start to contributing to PHP.

Derick Rethans 1:51

I'm reading the RFC, and it also states that there used to be an earlier RFC. How does that differ from the one that you've proposed?

Max Semenik 2:00

The previous RFC wanted to also permit a blanket catching of exceptions, as in catching anything at all, which, understandably, caused some objections from the PHP community. While most people commented positively on the part that I'm proposing now. Or should I say proposed, because the RFC passed and was merged yesterday.

Derick Rethans 2:35

I had forgotten about it actually, it's good that you reminded me. So yeah, it got merged and is ready for PHP eight. So basically you're saying you picked the non-controversial parts of an earlier RFC?

Max Semenik 2:47

I actually chose something to contribute and then looked for an RFC, to see if it was discussed previously.

Derick Rethans 2:55

Oh, I see. So your primary idea was wanting to contribute to PHP, instead of having an itch that you wanted to scratch, is that what you're saying?

Max Semenik 3:04

I have way larger itches that I will scratch later, when I have learned how to work with PHP's code base, which is really huge.

Derick Rethans 3:16

That makes some sense I suppose. When looking at the vote for the RFC I actually couldn't see that you had voted for it yourself. Did I miss something?

Max Semenik 3:25

I don't have a php.net account so I can't vote for myself, obviously.

Derick Rethans 3:31

I actually think you can because you have written an RFC.

Max Semenik 3:35

I haven't seen any interface to vote.

Derick Rethans 3:38

Interesting. It's actually something to catch up on, because I'm pretty much sure that you can. We should investigate that for some other RFCs that are still open, because I think you should be able to.

Max Semenik 3:49

Would be nice. I mean, this wouldn't change anything but...

Derick Rethans 3:54

That's true, but I mean, you've started contributing. If you're able to vote, that's the fair thing to do, I suppose. So as you said, this is your first contribution to PHP itself. How did you find the whole process of getting this going and getting started with it?

Max Semenik 4:10

As far as running an RFC goes, it was fairly straightforward to me. Maybe because I was looking at PHP RFCs in the past, so I knew how the process worked and it was really something that I already knew how to navigate. It's not the first open source community I'm contributing to, so I kind of know what to do in general.

Derick Rethans 4:40

How large is the MediaWiki community?

Max Semenik 4:43

It's probably larger than the PHP community in terms of actively contributing people, in that the Wikimedia Foundation has lots of paid programmers that work on the ecosystem. Obviously the outreach of your community is larger than MediaWiki's.

Derick Rethans 5:08

You're saying that there's more people working on it. But there's more people using PHP?

Max Semenik 5:15

And more people actively interested in development.

Derick Rethans 5:21

Do you think that's because it's easier to contribute to something that's written in PHP, than PHP itself?

Max Semenik 5:28

Not a lot of people know how to program in C these days. And while I used to be paid for writing C, my C's currently extremely rusty. Unlike PHP, for example.

Derick Rethans 5:44

For me it's sort of the other way around, because I haven't been writing PHP code for quite some time now, except for some test cases, so I know nothing about frameworks whatsoever. I know C pretty well. In any case, we now have one more active contributor, that is you. You've had things merged, and that makes you a contributor in my eyes. This is a pretty small RFC. And I think during the course of the last few months I've discussed with several other contributors that small RFCs are a good thing, because it's much harder to find problems with them. There are a few other RFCs as well that are also small, and for which the authors declined to talk to me for various different reasons. And two of those are actually really really simple things, and they both have to do with the bundling of extensions in PHP. Now, just thinking about this question: how does MediaWiki, for example, think about which extensions it can use in its source code?

Max Semenik 6:45

For MediaWiki, first of all, on start-up MediaWiki quickly checks if all the hard required extensions are available, and it just bails out if they aren't available. I'd need to look whether it checks for JSON, or assumes it's way too obvious to even consider whether it's present or not.
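A start-up check of this kind could be sketched as below. This is not MediaWiki's actual code, only an illustration built on the real `extension_loaded()` function; the helper name and the deliberately fake extension name are hypothetical.

```php
<?php
// Hypothetical helper: returns the names of required extensions that are not loaded.
function missingExtensions(array $required): array
{
    return array_values(array_filter(
        $required,
        fn (string $ext): bool => !extension_loaded($ext)
    ));
}

// 'standard' is always present; the last name is made up so that
// the example reports at least one missing extension.
$missing = missingExtensions(['standard', 'definitely_not_a_real_extension']);

if ($missing !== []) {
    // A real application would stop here with a clear error message.
    echo 'Missing PHP extensions: ' . implode(', ', $missing) . "\n";
}
```

The fewer extensions can be compiled out or packaged separately, the shorter such a list of checks needs to be.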

Derick Rethans 7:10

So you just mentioned the JSON extension. That makes sense because that's in one of my notes. One of the RFCs, as you just alluded to, is about the JSON extension, and PHP eight will have this always available now without you having to enable it with configure flags, which is a pretty good way of making sure that the extension is always available to everybody using PHP. Do you agree that having a JSON extension always available is a good idea for PHP?

Max Semenik 7:37

Yes, absolutely. One of the aspects of writing software that's available for everyone to use, as opposed to some internal company software that's running on a few servers and that's it, is that you need to support a wide variety of systems. And if it's possible to compile PHP without JSON, it means that someone will compile without it. It also means that some Linux distribution developers will package it as a separate package, and then someone will not install it, and you will get people complaining that MediaWiki doesn't work on their system. The more very popular extensions are always available, the better. If I know that many popular extensions that I need are always available, it makes my job easier and it also allows me to write better software, without having to resort to hacks and decreased functionality.

Derick Rethans 8:52

And what some other frameworks do is start making polyfills for them.

Max Semenik 8:56

And these polyfills might have, like, orders of magnitude worse performance. If I can have guarantees that a system has JSON, as well as other extensions like mbstring, intl, and so on, it would be really awesome.

Derick Rethans 9:16

The argument is always between: do we always want to have everything inside PHP or not? And at some point you need to start making a distinction about whether this is useful enough for everybody or just for a smaller group of people, and mbstring is probably an example where this is sort of on the line, right? I mean, it's useful enough, but is it useful enough to have it always enabled instead of having it easily installed as a package?

Max Semenik 9:42

Well, you know, lots of people are running software, whether it's MediaWiki, WordPress, or something else, on crappy shared hosting, which is the bane of every programmer's existence, but they still have to support it. The point is really that if something can be messed up, some people will have to run their code on systems that have it messed up. And if we can avoid it, why not?

Derick Rethans 10:11

Another RFC that's just gone through is about unbundling an extension. Some versions of PHP will have extensions being brought into core and being always made available, like we did with the hash extension in PHP seven four. But of course we're also removing extensions from PHP to live somewhere else. Not just not having them always enabled, but not even having them distributed with the PHP source code. In PHP seven four we had for example the Firebird extension, I believe, because there wasn't a lot of people using this. In this case we have the XMLRPC extension. Have you ever heard of this XMLRPC extension, because you said you've been programming PHP for a while?

Max Semenik 10:51

I've heard about the protocol itself and I might have heard about PHP having this extension, but I've never used it, and honestly I don't know why anyone would be using it.

Derick Rethans 11:04

It's sort of being used a little bit when people really didn't want to use SOAP, because it was too complicated. But before we had invented JSON pretty much. That's a long long time ago.

Max Semenik 11:18

These days XMLRPC sounds like a legacy corporate system. That's probably why there's no use having it in PHP proper.

Derick Rethans 11:32

I think I very much agree there. In any case, non-capturing catches are in PHP eight. You said that the RFC was accepted; has the patch been merged as well?

Max Semenik 11:41

Yep.

Derick Rethans 11:42

Great. I'm going to give a talk next month for the Dutch PHP conference, where I'm talking about new additions in seven four, but also what's coming up in eight dot zero, so I might be able to have a slide about it in there.

Max Semenik 11:57

Awesome.

Derick Rethans 11:58

Thank you, Max, for taking the time today to talk to me about non-capturing catches and bundling of extensions.

Max Semenik 12:05

Thank you, Derick for giving me this tribune. It was a nice talk.

Derick Rethans 12:09

Excellent. Thanks for listening to this instalment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.


PHP Internals News: Episode 57: Conditional Codeflow Statements


In this episode of "PHP Internals News" I chat with Ralph Schindler (Twitter, GitHub, Blog) about the Conditional Return, Break, and Continue Statements RFC that he's proposed.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:17 Hi, I'm Derick, and this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 57. Today I'm talking with Ralph Schindler about an RFC that he's proposing titled "Conditional return break and continue statements". Hi Ralph, would you please introduce yourself?

Ralph Schindler 0:37 Hey, thanks for having me Derick. I am Ralph Schindler. Just to give you, I guess, the 50,000 foot view of who I am: I've been doing PHP for 22 years now, ever since the PHP three days. I worked in a number of companies in the industry. Before I broke out into sort of knowing other PHP developers, I was a solo practitioner. After that I went to work for 3Com, and that was kind of a big corporation. After that I moved to Zend. I worked in the framework team at Zend, and then after that I worked for another company based out of Austin, for a friend of mine, Josh Butts. That's offers.com; we've been purchased since then by Ziff media. I'm still kind of in the corporate world. Ziff media owns some things you might have heard of: PC Magazine, Mashable, offers.com. The company that owns us is called J2, they are jFax. They keep buying companies, so it's interesting. I get to see a lot of different products and companies that get bought and kind of get folded into the umbrella, and it's an interesting place to work. I really enjoy it.

Derick Rethans 1:39 Very different from my non enterprise gigs

Ralph Schindler 1:43 Enterprise is such an abstract word, and, you know, it's kind of everybody's got different experiences with it.

Derick Rethans 1:49 Let's dive straight into this RFC that you're proposing. What is the problem that this RFC is trying to solve?

Ralph Schindler 1:54 This is actually kind of the bulk of what I want to talk about, because the actual implementation of it all is extremely small. As it turns out it's kind of a heated and divisive topic. My Twitter blew up last weekend after I tweeted it out, and some other people retweeted it, so it's probably interesting. I really had to sit down and think about this one question you've got, which is: what is it trying to solve? First and foremost, it's something I've wanted for a really long time, a couple years.

Two weekends ago I sat down and it was a Saturday and I'm like, you know what, I haven't hacked on the PHP source in such a long time. The last thing I did was the colon colon class thing, and that was like seven or eight years ago. And again, I got into that because I really wanted the challenge of like digging into the lexer and all that stuff and, incidentally, you know, I load PHP source in Xcode, and my workflow is: I like to set breakpoints in things, and I like to run something, and I look in the memory and I see what's going on, and that's how I learn about things. And so I wanted to do that again. And this seemed like a small enough project where I could say, you know, this is something I want to see in the language, let me see if I can hack it out. First and foremost, I want this. And, you know, it's a simple thing.

So what it is exactly: it's basically at the statement level of PHP, it is what they like to call a compound syntactic unit. Something that changes the statement in a way that I think probably facilitates more meaning and intent, and sometimes, not always, it'll do that in fewer lines of code. To kind of expand on that, this is a bit of a joke, but a couple years ago there was that whole argument online about visual debt. I don't know if you remember hearing that terminology.

Derick Rethans 3:34 Yep.

Ralph Schindler 4:47 Foo.

Derick Rethans 4:23 Up to now we haven't spoken about what the RFC is proposing, so maybe we should talk about that first and then get back to the other things that you said. You've spoken a little bit about the reasons why you want to change something. But what would you like to add to PHP, or what would you like to modify in PHP?

It's, you know, it's very closely related to what in computer science is called a guard clause, and I used that phrase lightly when I originally brought it up on the mailing list, but it's very closely aligned to that. It's not necessarily exactly that, in terms of the syntax. When you speak about it in the PHP code sense, it really is sort of a change in the statement: putting the return before the if. That's really what it is. So guard clause, it's important to know what that is: it's a way to interrupt the flow of control, you know, over the history of programming languages.

Ralph Schindler 7:19 Let's just go back to Pascal. In Pascal, like 50 years ago, there was no opportunity in the code to exit early from either a loop or a method, so you had to wait until you got to the very final sort of statement, and there was a single exit from a function. Guard clauses allow you, if you're inside of a block of code, or a loop, or some kind of flow of control, to effectively say: I want to exit here instead of continuing on. They did a whole bunch of studies on Pascal and they found out that students couldn't come up with the right solution when, let's say, you had a loop statement that had to execute 100 times and there was no opportunity to get out early. When you gave them the opportunity to interrupt the flow of control, the correctness of their solutions ultimately got better. Almost 100% of the time they were able to say, you know what, this is an exceptional piece of code, I want to exit here.

Fast forward: guard clauses, if you've kind of followed the Kent Becks and the Martin Fowlers, they would argue for guard clauses. You know, over time that's gotten more popular as an argument over the past, let's just say, 15 years in our industry.

Derick Rethans 8:23 Would another term for this be like an early return?

Early returns are one of them, along with early breaks and early continues: getting to a place in code where you just say, you know what, there's a particular condition in this normal flow of execution, I want to stop that normal flow and I want to break out of it. Goto is another tool that allows you to do this. I don't know if you can do it inside of loops, maybe you can. There are some exceptions in PHP about where you can jump to and from.

You can jump out of a loop, but you can't jump into one.

To some degree, these tools do sort of exist; goto is another heated topic in the PHP world. So, getting back to what the guard clause is. More specifically, it is very closely and semantically aligned with a Boolean expression. You will generally say: I want to either return, break, or continue, based off of this Boolean. PHP itself does not have first class support for guards. The way we achieve it currently is that we put the Boolean expression first, with a block of code associated with it, so: if, curly brace, block of code, that might terminate in an early return. Inside of switch statements or loops, you'll see: if something something something, continue one, continue two, or break one, break two. An if expression, along with a return, break, or continue, is the way we achieve it in PHP. This proposal is kind of giving first class support to a guard clause. It would be spelled out in the manual, and it would be a tool that, since it has a name and it is in the language, programmers could reach for and say: I know what that is, or: here's what it is in the manual, how do I use that? That's kind of, you know, what a guard clause is.
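
As a point of reference for the description above, here is a minimal sketch of guard clauses as they can be written in PHP today, with the Boolean expression first and the early exit inside the block. The function `first_even_square()` and its inputs are hypothetical and exist only for illustration.

```php
<?php
// Guard clauses in today's PHP. first_even_square() is a hypothetical example.

function first_even_square(array $numbers): ?int
{
    // Guard clause: leave the function early on an exceptional input.
    if ($numbers === []) {
        return null;
    }

    foreach ($numbers as $n) {
        // Guard clause inside a loop: skip values we are not interested in.
        if ($n % 2 !== 0) {
            continue;
        }

        // Early return: stop as soon as the answer is known.
        return $n * $n;
    }

    // No even number was found.
    return null;
}
```

In each case the reader has to look inside the `if` body to find out that the block ends in `return` or `continue`, which is the point the discussion goes on to make.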

At the moment, the guard clause you mentioned can sort of be implemented by doing: if, your condition, and then in curly braces a return, or break, or continue, whatever you said. What is the syntax that you want to replace this with?

I don't want to replace syntax. PHP is a flexible language. We have multiple ways of doing lots of things. We have multiple ways of crafting closures and anonymous functions. We have two different ways, which have existed since the beginning of PHP's time, of doing if statements: one is broken up by the colon, with the block ending in endif, or you can do it with curly braces. You've noticed with various PSRs and whatnot that people have gravitated towards a particular coding standard. And for all intents and purposes, for the global community of programmers to have a shared diction, that's a good thing.

Ralph Schindler 10:50 With regards to PHP, the most important characteristic of this RFC is this: PHP is a left to right language, you know, like 90-95% of the speaking world is left to right. They tend to put the emphasis, especially in coding, on the left side. So this moves the return keyword to the left side of a statement or syntactic unit, and when you look at this code, the first thing you see is: return. In variation one, which is the one I proposed for this feature, "return" is followed by "if". What you notice is that when you look at the code you'll see "return if", and it almost looks like its own keyword. Those two individual tokens, those keywords, must sit closely together. You know, maybe there are two spaces between them, but return and if are right next to each other, so they can be treated almost as a new keyword in and of itself. So as you're reading code top down, left aligned, you'll see return if, return if, and finally at the bottom of the method you'll see return. So that's variation one, and what it does is create this precedence: the static, constant keywords return and if are first and second. Your expression is third. Your optional return value is fourth. In most of the cases where you're writing this, it becomes a one-liner. That's not to say we can't do one-liners today, because you can do: if, if-expression, something, return. But what happens when you look at that code is that the return is off to the right. Or, if you don't want to break with the PSR coding standards, you can do curly braces and then put the return on the next line; now you've got three lines of code, and your return is indented.
As you're visually approaching this code, what you see first is that there's an if statement there, but then you have to scan the body of it to see if there's an early return. The fact that it's an early return becomes abundantly clear in variation one at the leftmost rail of the code, at the leftmost side of the statement, assuming you're not putting all of your code on one line.

Derick Rethans 12:59 You talk about variation one, I guess there's a variation two as well. What is the difference between them?

Ralph Schindler 13:05 As with all RFCs, people have preferences. Just as with politics in general, if you're in a political position, which this is, a political change to PHP, you have to know where your constellations are. You have to know, basically: if I want to appease the largest number of people, what will I have to give up in order to get something that is still beneficial to me? For me, right now, it is the compromise position. That's not to say I won't like it more, maybe a month from now, but effectively variation two is moving the optional return value to right after the return. Return, optional return value, then the if, and then the non-optional if expression, followed by the semicolon. So basically it would read more like English, so to speak. Return this, if this. From what I understand, it is that way in Perl. I know it's that way in Ruby. Ruby follows the same thing because the way they've implemented it is not necessarily as a single statement; they've implemented what they call statement modifiers, where any statement can be modified with a conditional at the end of it. That's the alternative syntax. If I were to use this, I would still get value out of it, because maybe I don't return an optional expression and then I'm still left with return if this. I still have my escape hatch for methods that have an optional return, the ability to return void.
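
To make the two variations easier to compare, the sketch below shows today's equivalent code, with the proposed forms in comments only. The commented `return if` lines follow the spoken description above; their exact punctuation is an assumption, they are not valid PHP today, and `clamp_age()` is a hypothetical function.

```php
<?php
// The "proposed" lines in comments are NOT valid PHP today; they only mirror
// the two variations as described in the interview. clamp_age() is hypothetical.

function clamp_age(int $age): int
{
    // Variation one (proposed, assumed spelling):  return if ($age < 0): 0;
    // Variation two (proposed, assumed spelling):  return 0 if ($age < 0);
    if ($age < 0) {
        return 0;
    }

    // Variation one (proposed, assumed spelling):  return if ($age > 150): 150;
    // Variation two (proposed, assumed spelling):  return 150 if ($age > 150);
    if ($age > 150) {
        return 150;
    }

    // The normal, non-guarded path.
    return $age;
}
```

In variation one the keywords `return if` always start the line and the optional value trails at the end; in variation two the optional value sits between `return` and `if`.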

Derick Rethans 14:26 In variation one, how do you separate out the condition with the optional return value?

Ralph Schindler 14:32 There's another reason why I thought variation one was good for PHP specifically. Let's just do like two seconds of history. If you go back 20 years, the way you write a method signature in C++ is: public, int, method name, typed arguments. So the return type, what we call the hint, precedes the method name in C++.

Derick Rethans 14:55 I've just been talking to Dan Ackroyd for the podcast episode that came out last week, where he is saying that we should stop calling them hints, because they're no longer hints, they're now proper types. Maybe we should pick that up here as well then?

Ralph Schindler 15:10 We've had that discussion for 10 years now. But people know them as hints. We have such loaded phrasing in PHP, like type coercion. Whatever we call them, I'll just continue with hints for the time being, because the audience of this particular podcast knows them as hints. The hint in C++ would have been all the way to the left of the line, whereas in PHP, when we chose to implement typing of the return values, we did it in a way where the method signature has the colon and the return type at the end. Variation one follows that same pattern, where your colon and return value look exactly like the layout of the method signature that you see up top. There's a big parallel there between the method signature and an early return with an optional return value. Also, I like optional things to be at the end. When you look at this whole statement, that's the optional part, whereas with variation two the optional part being in the middle means that return, optional part, if, and return if are both valid things. So the parallel is the method signature. That was kind of why I personally like the first one. They're both my children at this point, I love them both.

Derick Rethans 16:20 As you said, introducing syntax is always a bit tricky and it's a political choice. What has been the feedback on, or the criticism of, your suggested additional language construct?

Ralph Schindler 16:33 The smallest changes always get the most feedback, because there's such a wide audience for a change like this. People can immediately see the benefits or the negative value of it in their own code, all the way from the junior programmer up to the senior programmer. I can't quantify who's junior and who's senior. I also can't quantify who has been programming a long time and is, for lack of a better term, set in their ways and likes their style, versus those who have adopted a certain flexibility in the way that they develop, depending on the size of the team they're on and how much leniency they give someone else to write code that they will just, you know, code review and accept. So the interesting thing is that you have to kind of understand junior programmers and senior programmers. When junior programmers get in there and start programming, they tend to write code that is very brute force; they just write a lot of code, because in order to get better at writing code you just keep writing code. Their perspective is from the code writing standpoint; they're not looking at this from a code reading standpoint. So a junior programmer relies on ifs and loops and the rudimentary techniques: less abstraction, fewer methods, more lines of code. They tend not to break things out into well-named methods. Whereas as they grow as programmers they start reading other people's code more, and then they do start appreciating abstraction, like: this 50 line thing needs to be a five line thing. It needs to have its own name as a method over here, I need to reduce the number of inputs, have very specific outputs, and so on and so forth. So it's more highly structured code. Putting a feature like this out, you get a range of perspectives from people. It goes without saying.
I mean, Taylor retweeted it, and I know he has a preference for this style of programming. I know exactly where it came from. He appreciates certain things in the Ruby world; the return if statement in Ruby is a clear, concise, and very impactful statement, and to some degree he's implemented that same thing in Laravel. So if you look at the helper methods in Laravel, someone that writes Laravel applications is used to using something like abort_if, or throw_if. Interesting side note here: PHP is going to have a feature where you can put a throw expression in a ternary. That in and of itself allows exceptions to have a much more concise syntax. It allows you to use PHP exceptions for flow control. You still can't do that with a return, for example; you can't have a ternary with a return in it. And I guess that is another way of being able to achieve the same thing. This idiom, going back to guard clauses and going back to thinking about early exits from methods, was prevalent in Laravel, where you could say abort_if in a controller method. This is specific to an HTTP context, because you're inside of a controller, and abort is highly specific to HTTP: where you are going to return a 404 or a 500, it's going to throw an HTTP exception, and the framework knows to convert these kinds of exceptions into error paths in an application. So again, we're still talking about application code, not necessarily library code. So abort_if and abort_unless are an idiom that I've seen work fantastically for controllers. When you're thinking about a request, and PHP is highly request driven, you can see when you start this method with the request object: these are all my early outs, this is where I'm going to return, and then at the very final spot I might be returning a view, which is a successful page for this MVC application.
I feel like it was a successful idiom there, and that was also part of what drove me to say, you know, it would be neat if I could just say: return response, if this condition, and have that early out.
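
The throw expression mentioned above did become available in PHP 8.0. The sketch below shows it inside a ternary, next to a small helper that only imitates the shape of the Laravel idiom. `throw_if_sketch()` and `show_profile()` are hypothetical names for this illustration and are not Laravel's own helpers.

```php
<?php
// Requires PHP 8.0+, where throw can be used as an expression.
// throw_if_sketch() and show_profile() are hypothetical; they only imitate
// the "throw if" style of early exit discussed in the interview.

function throw_if_sketch(bool $condition, string $message): void
{
    if ($condition) {
        throw new RuntimeException($message);
    }
}

function show_profile(array $users, int $id): string
{
    // Early out, written as a single left-aligned call.
    throw_if_sketch($id <= 0, 'Invalid id');

    // throw used as an expression in the "else" branch of a ternary.
    $name = isset($users[$id])
        ? $users[$id]
        : throw new OutOfBoundsException("No user $id");

    return "Profile of $name";
}
```

A `return` cannot be placed in a ternary branch the same way, which is the gap the speaker is pointing at.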

Derick Rethans 20:12 What's been the biggest criticism so far?

Ralph Schindler 20:15 The biggest criticism is: we can already do this. See, I hear that all the time, with all sorts of other features, to varying degrees. I can do this with: if something, return something early. I said earlier that the proposed syntax might not be shorter, and that's true. It is just changing the order of the keywords, but, you know, that's an important distinction: I want the precedence of the return to be earlier in the line. I think that's the important distinction. And I feel like maybe people that are saying it doesn't reduce the amount of code need to take that into account. And it's hard to really take that into account unless you see variations of this sort of mental model of code. That's on me. I've been taking in all the criticism, and I'm kind of in a cooldown phase right now. I've been watching Twitter, I've been watching Reddit. It's generally cooled down on the internals mailing list, and I'm just kind of thinking about it, because, going back to likening this to a political sort of thing, I have to rephrase my argument for people that have a very firm stance of: I don't like this because I don't like it, or: I don't like this because it doesn't shorten my code. I have to find an argument that gets them to start thinking about why this might be a good thing. I understand that this might get shot down in PHP. Right now, if I was a betting man, and we were in Vegas, and someone asked me: do you think this is going to go through? I probably would have to bet against myself, I think 40-60. The temperature that I've taken on internals and everywhere else seems to indicate that it wouldn't be successful, but I'm collecting my evidence right now and putting out a blog post that kind of explains why it is what it is, and putting a better argument forward.
If that can't push it over the threshold, you know, I'll accept the defeat, so to speak. Look at the history of PHP: annotations, or attributes, whatever they were called, were shot down eight years ago. And, interestingly, I used annotations back in the day with Doctrine; I no longer use Doctrine. I might have voted to not accept them eight years ago, and I voted to accept them now, even though I don't use a variation of that any more.

Derick Rethans 22:15 There are a few things that keep changing over time, right? First of all, people turn from junior programmers into senior programmers, so they think about how to structure code more and more. And at the same time they also start seeing the value of some things that PHP never had. A good example is scalar typing, which has been spoken about for maybe 15 years even, and it took so many different approaches. And, as you say, attributes, although attributes are a little bit different, because this RFC is absolutely not the same as the earlier ones: the implementation is quite different from version one, and it ended up solving lots of problems that people found with the original RFC.

Ralph Schindler 22:53 I have not been part of sort of the global PHP community. I started in the mid-2000s, having worked with PHP since 1998. I remember the early days where PHP was not fast at all. It wasn't as fast as other things, but I gravitated towards it because I liked the syntax. Back in that day, I would have had more of an emphasis on things that would run faster, regardless of how they looked. I had projects; for example, in college I wrote a program where kids would go up and, for Valentine's Day, put all their preferences in. That was in the week leading into Valentine's Day, and then on Valentine's Day they could come back to the University Center and get a printout of all the other people that had filled out the questionnaire and matched. When you have 1000 people fill out a questionnaire, and this was PHP in '99 or 2000, when I tell you it took hours for the script to run and calculate all of the matches for a person, changing just the way an if statement would run, or changing the way you early exited an if statement when you knew that you had to filter out a person, changed the run time by hours. The code was very, very closely aligned to the performance, whereas now, with PHP 8, I think that we have so many more affordances. You don't have to think about: should I interpolate strings inside of a single quote or double quote? None of that matters any more. We've solved all those problems. You can call sprintf just as quickly as you can do an echo, and no one really cares, it's going to perform the same. That wasn't the case 20 years ago, it is the case now, so now we have this affordance where we can look at, you know, for lack of a better term: is the code pretty, is it easy to read?

Derick Rethans 24:32 Thank you all for taking the time this afternoon, or in your case morning, I think, to talk to me about your RFC. I'm looking forward to seeing this coming to vote at some point.

Ralph Schindler 24:43 I appreciate you having me on your podcast. Thank you.

Derick Rethans 24:47 Thanks for listening to this instalment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.


PHP Internals News: Episode 56: Mixed Type v2


In this episode of "PHP Internals News" I chat with Dan Ackroyd (Twitter, GitHub) about the Mixed Type v2 RFC.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:20

Hi, I'm Derick, and this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 56. Today I'm talking with Dan Ackroyd about an RFC that he's made together with Mate Kocsis; it's called the mixed type version two. Hello, Dan, would you please introduce yourself?

Dan Ackroyd 0:38

Hi Derick. So my name is Dan Ackroyd, also known as Danack online. I maintain the PHP Imagick extension. And I also contribute to PHP internals a little bit by maintaining a set of documents called the RFC Codex, which are a set of notes on why certain ideas haven't reached fruition in PHP core, and occasionally I help other people write RFCs.

Derick Rethans 1:04

PHP's type system has continued to improve over the last few releases, and we've seen a few more things coming into PHP 8, such as union types. For a long time, there has been an issue with PHP's internal functions, in that the types they return cannot necessarily be represented in PHP's type system because they do strange things. This RFC builds further on top of PHP's type system. What is it trying to solve?

Dan Ackroyd 1:29

There's a couple of different problems that it's trying to solve. The one I care more about is userland code; I don't actually contribute that much to internals code, so I'm not that familiar with all the problems it has. The reason I got involved with doing the mixed RFC was: I had a library for validating parameters, and due to how that library needs to work, the code passes user data around a lot internally, and then back out to where the library returns the validator's result. So I was upgrading that library to PHP 7.4, and that version introduced property types, which are very useful things. What I was finding was that I was going through the code, trying to add types everywhere I could. And there's a significant number of places where I just couldn't add a type, because my code was holding user data that could be of any type. The mixed type had been discussed before; it's an idea that people had been kicking around, but it had just never been really worked on. That was the motivation for me. I was having this problem where I couldn't upgrade my library as I wanted to, and I kept forgetting: has this bit of code here been upgraded and I just can't add a type, or is it the case that I haven't touched this bit of code yet? So coincidentally, I saw that Mate was also looking at picking up the RFC, and he had copied the version that Michael Moravec had been working on previously. As I mentioned earlier, I help people write RFCs. For a lot of people for whom English isn't their first language, writing technical documents in English is a difficult thing to do. I also think that writing RFCs in general is slightly harder than people really anticipate. Each RFC needs to present clearly why something's a problem, why the proposed solution would work, and, at least to some extent, why other solutions wouldn't work.
Looking at the text from the previous version, I could see that, although I understood all of the parts of that RFC, I don't think it made the case for why mixed was the right thing to do in a very clear way. So I spent some time working with Mate to redraft the RFC, discussing it between ourselves and going through a few of the smaller issues before presenting it to internals for it to be officially discussed as an RFC.

Derick Rethans 3:51

Where does the name mixed actually come from?

Dan Ackroyd 3:54

So, mixed is actually a very old concept in PHP it's been used in the docs for multiple decades. I think we have multiple core contributors who are younger than the mixed type, which is an interesting situation for a language to be in. It had been used in the documents, all over the place. It has been used to show that the type of a parameter, or return type from functions was quite complicated. It's actually slightly different from how people might use it in userland code. A lot of the places where it's used in the docs would now use a union type there instead of the mixed type. But there are still places where mixed is the correct type to use in the documents.

Derick Rethans 4:40

This being an RFC, you're proposing something to do in it. What are you proposing to introduce into PHP?

Dan Ackroyd 4:46

To be precise, the RFC proposes being able to use the word mixed as a type for parameter types, return types, and property types. Mixed is really a shortcut for something that can be done with union types: mixed is the equivalent of writing array, or bool, or callable, or int, or float, or null, or object, or resource, or string. One of the benefits of mixed is that it's much shorter to type than the full equivalent of that.

Derick Rethans 5:18

And you'd have to do this every time you use it.

Dan Ackroyd 5:20

It's particularly hilarious when you've got a function that accepts any type of parameter, and then returns that parameter, that's been modified. So you have mixed on the way in, and mixed on the way out, having all of those words on the same line of code is just too much.
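
A minimal sketch of the "mixed on the way in, mixed on the way out" case, assuming PHP 8.0 or later. `trim_if_string()` is a hypothetical helper. Note that the long union could not be written out in full in userland anyway, since `resource` is not available as a declared type.

```php
<?php
// Requires PHP 8.0+. trim_if_string() is a hypothetical helper.

function trim_if_string(mixed $value): mixed
{
    // Anything that is not a string is handed back untouched.
    if (!is_string($value)) {
        return $value;
    }

    // Strings come back modified, still under the same "mixed" return type.
    return trim($value);
}
```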

Derick Rethans 5:35

Does this mean that mixed is pretty much implemented as a union type?

Dan Ackroyd 5:39

I have no idea. I'd have to refer you to the actual implementation, whose details I can't recall off the top of my head. The actual internal type checking in PHP is not as clean as you might imagine from userland, particularly around things like callable; it's not a straightforward path of code for checking whether something's callable. It works as a union type, but how it is actually implemented internally is probably more detailed than that.

Derick Rethans 6:07

I'll have a good look a little bit later then. As you said, it sort of acts as a union. But union types and variance are quite tricky. When I spoke with Nikita about union types, it wasn't the clearest explanation, because it's a really difficult concept, right? So how does the mixed type interact with variance in arguments, return types, or properties?

Dan Ackroyd 6:30

I agree completely. Variance is a complicated thing, and the Liskov substitution principle is a reasonably complicated thing. Full disclaimer here: I am not a computer scientist, I didn't study computer science at University. I studied chemistry and molecular physics, and the only formal education I've had in programming was a single 10 hour course that taught us how to use Fortran 77, which is a lovely language for the 70s, not quite so good for the 1990s when I was learning it. I think people concentrate too much on the theory behind computer science. If I read out the general rule of LSP, or the Liskov substitution principle, it says: if for each object O1 of type S there is an object O2 of type T, such that for all programs P defined in terms of T, the behavior of P is unchanged when O1 is substituted for O2, then S is a subtype of T. I don't fully understand that. I mean, I can go through it and understand it in principle, but I don't grok it at a fundamental level when I'm writing code. For me, a better way of thinking about LSP is to simply say: if your code follows LSP, then it's probably not going to blow up. If you violate LSP, your code has a very good chance of blowing up. For both parameter types and return types, the type checking that PHP implements through variance is done to make it conform with LSP, but the simplest way of putting it is: make sure that your code's not going to blow up on bad assumptions about the types that are being passed around.

Derick Rethans 8:17

Because PHP does adhere to LSP, your lovely new mixed type does have to adhere to it. How does your lovely new mixed type tie in with LSP and variance specifically? Because mixed is a little bit special in some cases: at the moment in PHP, if you have a method and you declare no type on it, it sort of acts like mixed. I saw that in the RFC there is specific handling of going from no type to mixed, and then back to no type.

Dan Ackroyd 8:48

One of the details of the RFC is that when no type is present for a function return, the signature checks for inheritance are done as if the return type had been declared as mixed or void, so that's a union type of mixed and void. That's the correct thing to do. It makes the code work as you'd expect it to, and avoids any possible scenarios where you'd make an assumption about the method in the parent class, and that assumption would not be true in the child class. I think this is one of the areas where PHP's special behaviour shines through. This might not be an acceptable solution to people who work in languages that have a cleaner type system, but they probably stay well clear of PHP to begin with. The details of how it works mean that the code behaves as you'd expect it to and doesn't blow up.
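
The inheritance rule described above can be sketched as follows, assuming PHP 8.0 or later. All class names are hypothetical; the one combination that PHP rejects is left as a comment so that the file still runs.

```php
<?php
// Requires PHP 8.0+. Base, AddsMixed, NarrowsFurther and ReturnsNothing are
// hypothetical classes used only to illustrate the signature checks.

class Base
{
    // No parameter type and no return type declared.
    public function handle($input)
    {
        return $input;
    }
}

class AddsMixed extends Base
{
    // Parameter: untyped -> mixed is accepted (the same set of values).
    // Return: the untyped parent is checked as mixed|void, so mixed is a valid narrowing.
    public function handle(mixed $input): mixed
    {
        return $input;
    }
}

class NarrowsFurther extends AddsMixed
{
    // Return types may narrow again: mixed -> int.
    public function handle(mixed $input): int
    {
        return is_int($input) ? $input : 0;
    }
}

class ReturnsNothing extends Base
{
    // The untyped parent may also be narrowed to void.
    public function handle($input): void
    {
    }
}

// Not allowed (compile-time fatal error), so shown only as a comment:
// class Broken extends AddsMixed { public function handle(mixed $input): void {} }
```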

Derick Rethans 9:42

What's the reason why void isn't part of the mixed union?

Dan Ackroyd 9:47

Mixed and void are related, but quite different from each other. For return types, mixed is a guarantee that a value will be returned, but we can't give you any more details of what the type of that value will be. Void is a guarantee, in quotes, that no value will be returned. I actually strongly regret void being present in PHP. I think it was a mistake. One of the very nice things about PHP is the way that every function returns null, even if you don't have a return statement in that function. This is something that's quite different to a lot of other languages, where it's common to have functions declared with a void return type, so there's no return value at all. Because PHP always returns null, it allows you to do things like call var_dump and put a function call inside var_dump's brackets, and that's always guaranteed to not blow up.

I would have strongly preferred us to introduce the null type to PHP, and for people to use that, when they're not returning a more semantically meaningful value from their function. I think that would actually be a lot better into the PHP type system, and make it a lot easier to write code, that's chainable.
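
A small sketch of the behaviour described above: a function without a return statement, and one declared `void`, both evaluate to `null` when their result is read. The function names are hypothetical.

```php
<?php
// no_return_statement() and notify() are hypothetical functions.

function no_return_statement()
{
    // No return statement at all.
}

function notify(string $message): void
{
    // Declared void: the body may not return a value.
}

// Reading the "result" of either call does not fail; both evaluate to null,
// so passing them straight into var_dump() is safe.
$a = no_return_statement();
$b = notify('hello');
var_dump($a, $b);
```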

Derick Rethans 11:10

The only real locations where it can't return any values is a constructor and a destructor in PHP.

Dan Ackroyd 11:16

It would still have a use for functions that never return, like continual loops, and also functions that only ever exit by throwing an exception. I think TypeScript has this; I think they call it never. I can't remember the details, but it has its uses. The way that most people are using it in PHP is wrong, in my opinion. The reason I still get a little bit worked up about this is because people are still suggesting that we should change the behaviour of the language to match the void return type, i.e. make it so that if you try to use the return value from a function that has a return type of void, PHP should blow up. I just strongly disagree with that. I think returning null so that functions can be chained together, even if there's no semantically useful information there, is preferable to having code blow up through trying to read the result of a function.

Derick Rethans 12:12

Because it's a bit different than in statically typed or compiled languages, where you can do all these checks in the compiler, right, and never at runtime, whereas in PHP these checks always have to happen at runtime.

Dan Ackroyd 12:23

They do, but I think it's at a different level than that. It's just: being able to define that reading the result of a particular function should make the program blow up, is that a useful thing to do or not? This is quite similar to another discussion that pops up every now and again, of whether to make PHP blow up if too many parameters are passed to functions. There are people who strongly feel that this is a terrible thing to allow, and that we need to punish anybody who has extra parameters being passed around. I actually find having extra parameters to be a useful debugging technique very occasionally. Imagine scenarios where you've got an interface that comes from a library and that's implemented in 10 different classes in your code, but you want to debug one particular implementation. Just being able to temporarily add some extra parameters to a method call, and have that just work, allows you to use some debugging techniques that just wouldn't be possible if PHP blew up when extra parameters get passed.

This is really similar to the void discussion, where people have very strong feelings along the lines of: we need to punish people who are writing code wrongly, we need to stop that code from working. I see it the other way: yeah, it's not great code, and maybe they might want to refactor their code to not do that, but I can't see any benefit in making PHP blow up.
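
The debugging technique mentioned above relies on the fact that userland PHP functions accept more arguments than they declare. A minimal sketch, with a hypothetical `Greeter` class:

```php
<?php
// Greeter is a hypothetical class. Extra arguments passed to a userland
// method are not bound to any parameter, but func_get_args() still sees them.

class Greeter
{
    /** Extra arguments seen on the most recent call, kept for inspection. */
    public array $lastExtras = [];

    public function greet(string $name): string
    {
        // Everything after the one declared parameter is "extra".
        $this->lastExtras = array_slice(func_get_args(), 1);

        return "Hello, $name";
    }
}

$greeter = new Greeter();

// Two extra arguments: the call still works and the result is unaffected.
$result = $greeter->greet('Ann', 'debug-tag', 42);
```

This applies to functions and methods written in PHP; it is not a statement about how built-in functions treat surplus arguments.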

Derick Rethans 13:49

In my opinion, this is something that belongs in a project's coding standards, and in the static analysers that they run over the code to make sure that all the stylistic choices are followed correctly. Not passing too many arguments to methods belongs in exactly that category, right?

Dan Ackroyd 14:05

I agree completely.

Derick Rethans 14:06

There's a few more things that I'd like to poke your mind about. The mixed type can't be marked as nullable, is there a reason for that?

Dan Ackroyd 14:14

We discussed this a reasonable amount when drafting the RFC. There's reasons to allow nullability, but what we couldn't see was a clear, strong need for why nullability would be required. The mixed type includes null as one of the types in the union of types it represents. So adding nullability doesn't actually add any more information to the mixed type, because by definition it can already be null. It's always possible to add more to PHP core, but removing features is really difficult. So we decided to leave it out for now, just because we couldn't think of a really strong reason to add it. If someone finds a really clear, compelling argument to allow mixed to be nullable, I would definitely be in support of that, so long as there was a reasonable reason to have it. What I'd probably prefer before that, though: it's kind of odd that the null type isn't usable as a type in PHP by itself. I think that's unfortunate, because for union types, imagine you've got some code that's going to return either a float or an int, and then you find a reason why it might need to return null. Changing the definition from float or int, to float or int or null, is easier to read for me than question mark, float or int. So I think that might be another RFC that pops up on the radar in the not terribly distant future.
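As an illustration of the two points in this answer, here is a short sketch assuming PHP 8.0 or later. The function names `describe` and `parseNumber` are made up for the example:

```php
<?php
// mixed already accepts null, so a nullable marker would add nothing.
function describe(mixed $value): string
{
    return get_debug_type($value);
}

// A union type that spells out null, the "float or int or null" form
// mentioned above, rather than a question-mark shorthand.
function parseNumber(string $input): float|int|null
{
    if (!is_numeric($input)) {
        return null;
    }

    // Adding 0 to a numeric string yields an int or a float.
    return $input + 0;
}

$kind  = describe(null);
$int   = parseNumber('42');
$float = parseNumber('1.5');
$none  = parseNumber('abc');
```

Here `describe(null)` is accepted without any nullable marker on the parameter, and the return type of `parseNumber` reads left to right as the three possibilities.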

Derick Rethans 15:38

Time is running out for PHP eight a little bit, of course. So resource is part of mixed, but resource is something you can't use as a type hint anywhere in PHP. So what's going on here?

Dan Ackroyd 15:51

Resource is more of a pseudo type than a real type in PHP. It comes from code that was written before PHP even had classes, is my understanding. Though obviously that's from the dawn of time, so it's hard to figure out where. When people started writing PHP, they used resource as we use classes now, to represent a complicated bit of state that needs to be passed around from one piece of code to another. The problem with resource as a type is that it doesn't really tell you that much about the type. If something is a resource, it could be a file handle, a curl handle, a GD image, an XML parser, or any of the other things that are called resource types. It's an ongoing piece of work to slowly refactor resource types away and replace them with classes wherever possible. An example of that is the hash context, which used to be a resource type in PHP, and I think since PHP 7.2 that's been changed to a class. Work's ongoing, and eventually, hopefully, most of the other resources will go away and be made into more specific types, but in the meantime resource still exists in PHP. The reason it's included in the mixed definition is because it's a reasonable thing to do to pass a file handle around. And so if you've got a parameter type of mixed, it's absolutely fine to pass in a file handle to that piece of code. Excluding the resource type would make the mixed type too annoying to deal with, because your code would then deal with all the other types except resource.
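To make the resource discussion concrete, here is a small sketch assuming PHP 8.0 or later. The helper name `kind` is hypothetical; the built-in calls are `is_resource()`, `get_resource_type()`, `get_debug_type()` and `hash_init()`:

```php
<?php
// There is no "resource" parameter type, so a function that wants to
// accept a file handle alongside other values can declare mixed and
// check by hand.
function kind(mixed $value): string
{
    return is_resource($value)
        ? get_resource_type($value)
        : get_debug_type($value);
}

// A file handle is still a resource.
$file = fopen('php://memory', 'r+');

// The hash context is an object of class HashContext (since PHP 7.2).
$hash = hash_init('sha256');

$fileKind = kind($file);
$hashKind = kind($hash);

fclose($file);
```

Both values pass through the same mixed parameter, which is the case Dan says would be too annoying to exclude.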

Derick Rethans 17:21

That makes sense. As I mentioned in the introduction, mixed is already something that's been used in the PHP documentation for a long time, and the RFC talks about stubs in PHP. This is something that is going to be introduced with PHP eight as well. What are these stubs?

Dan Ackroyd 17:38

I haven't contributed to any of this work, so I apologize to anybody who has been doing this piece of work if I get any of the details wrong. One of the problems with PHP core was that for a long time, the information that was used to generate the reflection information was done on a very ad hoc basis. Some of the information was incorrect, and keeping the reflection information up to date with the actual definitions of how the functions work was annoying, to say the least. There's been an effort by a number of the core contributors to set up a system of stub files, that allow people to write PHP code that defines a stub for each of the internal functions. So that's just literally a PHP file that has a stub version of the function, that just defines the parameter types, parameter names, and the return types. My understanding is that that information is then used internally by the PHP eight build process to generate the reflection information and extract the parameters where appropriate, and could be used for features like named parameters, where the name of a parameter would be coming from the stub file, rather than some random C file in the middle of the PHP core code.
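As a rough illustration of what a stub looks like, the sketch below declares a signature with an empty body and then reads the parameter names and return type back through reflection. The function `example_describe` is hypothetical, and the reflection calls only mimic, in userland, the kind of information the build process is described as extracting; this is not the actual php-src tooling:

```php
<?php
/**
 * Hypothetical stub: signature only, the body stays empty.
 *
 * @param mixed $value Documented in a comment, as stubs had to do
 *                     before a native mixed type existed.
 */
function example_describe($value, bool $verbose = false): string {}

$reflection = new ReflectionFunction('example_describe');

// Parameter names, as a feature like named parameters would need them.
$names = array_map(
    fn (ReflectionParameter $parameter): string => $parameter->getName(),
    $reflection->getParameters()
);

$returnType = (string) $reflection->getReturnType();
```

The stub is only ever declared and inspected here, never called.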

Derick Rethans 18:53

And the stubs at the moment can't represent mixed. That's still done with comments.

Dan Ackroyd 18:58

That's correct. This is similar to what I was finding with my own libraries, that there were just some things that you currently can't add type information for. And it was quite frustrating having to say: oh no, somebody hasn't missed this one, it's just not expressible. Another reason for having mixed is that, although generics are still going to be quite a long way off from arriving in PHP, if you wanted to express just a generic array that can contain any possible value, that's another case where the mixed keyword would be used.

Derick Rethans 19:29

I saw some people ask why mixed was chosen here and not any. Is there any specific reason for that?

Dan Ackroyd 19:36

The very short reason is that it was easier. PHP has had a mixed concept for multiple decades, and mixed is used widely in PHP core code and documentation. It's also used widely in the community for tools like PHPStan and Psalm, where people use mixed in docblocks or Psalm annotations to indicate any type. It's really widely established. We did discuss using any instead. It just didn't seem worth the effort of trying to push it through, at least in part because there's so much legacy going on. Also, it's just not clearly that much superior to mixed.

Derick Rethans 20:16

Very well. Are there any BC concerns with introducing the mixed keyword?

Dan Ackroyd 20:20

There's a small BC break: you probably can't use mixed as a class name or function name any more. But it's a pretty small one, and anybody using an IDE who has a function called mixed in their code can right click on the function, rename, and maybe go and get a cup of coffee if that IDE is slow. There are also tools in the PHP community. This is actually quite a surprising thing, that PHP has one of the best refactoring tools out there in Rector. That's a tool that, because it understands the abstract syntax tree of PHP, can understand that: oh hey, there's this new BC break in the next version of PHP. In this case, if you have some code that had a class named mixed, it would understand this is going to break. They provide sets of tools for allowing you to upgrade your code automatically. It's a really awesome tool. It's slightly surprising to me that it's probably one of the best code refactoring tools, if not the best, in any software language. I've looked at some other languages' ecosystems, and I think one of the things about PHP is that because it's actually quite a diverse ecosystem, and people sometimes migrate from Symfony to Laravel, or want to upgrade a PHP 5.6 codebase to PHP seven, or those types of things, the value in a refactoring tool is a lot higher. Somebody has gone out and done the work to make that tool, and it's really pretty good.

Derick Rethans 21:46

Sounds like something I should investigate a little bit then, because I actually had never heard of it. I'll also make sure to link to it in the show notes. When you were introducing yourself, you mentioned that you're the maintainer of the ImageMagick extension in PHP, that you can use to manipulate images. What's going on with this extension? Is there going to be an upcoming release at some point?

Dan Ackroyd 22:05

I want to apologize to everybody for being very lazy and not doing a release, even though there's a small segfault that happens occasionally, which we have a fix for. To be honest, I don't really use the extension at all myself. And so maintaining it is more a source of stress than enjoyment. I know there's many, many things that could be improved for the project, including doing releases on a timely basis, and improving the security of how it works, but it's just really hard to justify spending time working on it when it's just a source of stress for me and it doesn't really provide any benefit to me. As an effort to make it worth my time, effectively, or at least give me a goal to focus on, I'm going to start asking people to donate money to the project, to sponsor it, just so that I can actually justify myself getting stressed out from trying to help people with impossible to solve bugs that only happen on their system. Because otherwise it's just a bit too much stress for me to really want to spend much more time working on it.

Derick Rethans 23:08

Very well, do you have anything else to add?

Dan Ackroyd 23:10

Yes, I have a big request, and you've done this a couple of times during this interview. I'd very much appreciate it if everyone in the PHP community could refrain from using the word hints when talking about types. It used to be that the PHP type system was just hints, where, yeah, the documentation says that this function takes an int, but that was just a hint, and it wasn't really enforced. The type system in PHP has evolved into an actual type system that is enforced at runtime, and although it's not a big deal, it does help when talking amongst ourselves as a community, but also when we're talking to people who don't do that much PHP, who are coming from other languages where the type system is still just a set of hints. Using slightly more precise language about the PHP type system, and parameter types, return types, and property types, avoids any confusion about what's actually happening in the engine. And that is my windmill that I tilt at.
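The difference between a hint and an enforced type can be seen in a few lines. A minimal sketch; the function name `double` is made up for the example:

```php
<?php
function double(int $n): int
{
    return $n * 2;
}

try {
    // A non-numeric string cannot be used as an int, so the engine
    // throws at the call instead of treating the type as a suggestion.
    double('twenty-one');
    $outcome = 'accepted';
} catch (TypeError $error) {
    $outcome = 'rejected';
}
```

The declared parameter type is checked when the call happens, which is the behaviour Dan is asking people to describe as a type rather than a hint.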

Derick Rethans 24:11

Alright, thank you, Dan for taking the time this afternoon to talk to me. And I will be looking forward to seeing mixed in PHP because it got accepted, just earlier, yesterday I think. And, yeah, part of PHP's improving type system again.

Dan Ackroyd 24:25

Thanks for having me on. It's been a pleasure.

Derick Rethans 24:28

Thanks for listening to this instalment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool, you can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.


PHP Internals News: Episode 55: Dealing with Bugs

In this episode of "PHP Internals News" I chat with Ignace Nyamagana Butera (Twitter, GitHub, Blog) about how the PHP project handles bugs and bug reports.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:16

Hi, I'm Derick. And this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 55. Today I'm talking with Ignace Nyamagana Butera after he'd asked me on Twitter, how PHP deals with bugs. A few episodes ago, I did a Q&A session about the RFC process. And this time again, we'll have Ignace Nyamagana Butera asking the questions. Would you please introduce yourself?

Ignace Nyamagana Butera 0:46

Hello, everyone. Hello, Derick. My name is Ignace Nyamagana Butera, but you can call me Nyamsprod. I've been a PHP developer for around 15 years now. Currently, I'm working as a software developer and technical lead at an internet content provider agency. When I have free time, I'm doing some open source. I have a couple of projects that you may have heard of, like League CSV and League URI. I created them and I am currently maintaining them.

Derick Rethans 1:23

Yeah, as I said, it is not me asking the questions but you this time. So I think we should jump straight in, actually.

Ignace Nyamagana Butera 1:30

So my first question will be somehow really simple, because we are talking about bugs. And I was wondering if we had some statistics about bugs in PHP.

Derick Rethans 1:44

Well, there are some statistics. I mean, it's not really easy to get that information out of our bug system. But just having had a look, on average it's about one bug a day that gets reported at the moment. There are nearly 80,000 bugs in the bug system. Of course, not all of these are closed, some of them are open, but the majority of them are closed.

Ignace Nyamagana Butera 2:07

Are bugs from EOL PHP versions still being taken into account, or do we just say: okay, these bugs, for instance, are for PHP five, we'll no longer look at them?

Derick Rethans 2:18

Unless it's a security bug, we won't look at bugs for unsupported PHP versions. So at the moment, PHP seven three and seven four are still supported. So those bugs we'll of course look at. If it's a security bug, we will only go back to PHP seven two. If it's reported for any version older than seven two, for example seven one or seven zero, or even PHP four or five, which does happen occasionally, we'll tell them to upgrade first, because we won't spend time doing that.

Ignace Nyamagana Butera 2:47

Because I manage and maintain open source projects, I know that PHP as a language is used everywhere and you can have multiple reports. First things first, what is a bug? Because there are multiple definitions of it.

Derick Rethans 3:03

And I'm sure if you asked 12 people, you'd get 13 definitions. I think it is unexpected behavior of something that is documented. So if something is documented to do this, and it does something else, or it does something really wrong like crash your program, then that will be a bug.

Ignace Nyamagana Butera 3:21

What is the source of truth? Is it the PHP documentation? Is it the PHP language specification? What is the source of truth for saying: okay, this is expected behavior because it is documented? Or how does it work?

Derick Rethans 3:38

For most of the syntax, it's what the source does. And of course, you always find edge cases, and I don't have a good example right now. For anything that is syntax, I mean, documentation and behavior should absolutely always work the same. If it doesn't, it's likely going to be a bug in the documentation. If you, for example, look at other functionality, like in an extension, it is almost as likely that the documentation is wrong as it is that the code's behavior is wrong. In that case, we need to have a good look at what the expected behavior should have been. Now, with all the new features that have been put in since we have the RFC process, pretty much however the RFC describes that a feature should work is how the feature should work. And if it doesn't, that pretty much means there's a bug. Having said that, not everybody writes down all the expected behavior for all the functionality that an RFC has been put up for. And in those cases, you just need to see what makes the most sense when it's about a core feature.

Ignace Nyamagana Butera 4:40

What is the best way to report a bug? Okay, you have to go to bugs.php.net, I suppose. Yes. But apart from that, what is the best way to report a bug?

Derick Rethans 4:51

As you said, PHP's issue tracker is bugs.php.net. It tells you to fill in your problem, your expected behavior, and what you actually get out. What is always really important for people to be able to fix an issue, and to find out whether there is an issue to begin with, because that's not always the case either of course, is to have a short script that reproduces your problem. And by short, that means as short as you can get it. 10 lines at most for most syntax features would probably do the job. In some cases, if it's a bug for a database related system, then of course there's going to be some database setup necessary for it. But if it's just syntax, then a short script that reproduces the problem, that shows what goes wrong, is really important. And of course, it's also important to say what it did, and what you expected it to do. Also, don't lie about your PHP version, because in some cases people try to report a bug with a higher PHP version than they're actually using, which is kind of frustrating at times.

Ignace Nyamagana Butera 5:52

I guess that yeah, if we report something that didn't work in PHP five, but it was fixed in PHP 7.2 or PHP 7.3 everybody loses a little bit of time.

Derick Rethans 6:02

And in some cases people file a bug report for, say, PHP 7.4.1, right, and we're currently at 7.4.6. We will always ask them first to upgrade if they can, because upgrading PHP should take a lot less time than trying to reproduce and fix a problem that has already been fixed.

Ignace Nyamagana Butera 6:20

And what is the strategy between the release of each version of PHP and the bug fixes? Does PHP wait for all the bug fixes to be done and then a release is made? Or if, for instance, I report a bug today, just before a release is scheduled, will this bug be skipped for the next release and be tackled after?

Derick Rethans 6:46

Every minor version of PHP, be it seven two, seven three, or seven four at the moment, has a release every four weeks. Two weeks and two days before a release gets made, we make our release candidates. Everything that has made it into the release candidate will make it into the release. If bugs get fixed in between the release candidate being created and the final release, then unless they are really critical, they will not make it into that release, but will have to wait until the next cycle. So we don't necessarily wait for all the bugs to be fixed before we make a release. Now, there is an exception here, and that is for security bugs. If you find security bugs, they don't end up in the normal PHP seven four branch. They get committed to a security repository that very few people have access to. And these security bug fixes get merged into the release branches two days before the release comes out. They don't end up in the release candidate builds, because we don't want people to have 16 days to be able to exploit security bugs, if they are remotely exploitable, for example.

Ignace Nyamagana Butera 7:53

And can security bugs, or critical bugs push a release?

Derick Rethans 7:59

Technically, yes. If somebody ends up finding, like a remote exploitable bug in PHP, then there will be an emergency release for them. But I can't remember the last time we had to do that.

Ignace Nyamagana Butera 8:10

I remember, like one or two years ago, there was a bug that was going from the bug tracker to the internals mailing list and coming back again to the bug tracker, because there was some kind of indecision about whether it was a bug or whether it should be a feature. How is this possible?

Derick Rethans 8:32

We don't really have a set method for doing this. But our bug tracker isn't the most advanced system in the world. And sometimes it just makes sense to thrash out a discussion over email on our PHP internals mailing list, or sometimes these discussions happen on other chat channels as well, I'm sure, just to go through and see what's the case. And sometimes, if it is hard to decide whether there's a bug, then it is always a good idea that more PHP core developers have a look at it and see what's going on there. So sometimes it makes it easier if that's discussed on the mailing list rather than in the bug tracker.

Ignace Nyamagana Butera 9:04

Is it possible that, for instance, someone submits an RFC, and then during the course of discussion of this RFC, it becomes clear that this is not an RFC, but more of a bug fix?

Derick Rethans 9:16

I don't think I can think of an example here actually.

Ignace Nyamagana Butera 9:19

I remember one example.

Derick Rethans 9:21

Okay.

Ignace Nyamagana Butera 9:23

Because I think it was, yeah, two years ago, about the behavior of the CSV escape character. And I remember at some point it was suggested to be an RFC. And because of the amount of backward compatibility breaks, it was better to treat it like a bug. But I remember that between the bug tracker and the mailing list there was a whole discussion to be able to say exactly: okay, this is a bug, and this is an RFC. And it was really not, it was a call at the end saying, okay, we will treat it like an RFC, and we will change the way the escape character works today. But it won't be as impacting as if it was an RFC that introduced a completely new behavior.

Derick Rethans 10:12

CSV is a very difficult format, because everybody implements the standard in a slightly different way. And the way it originally got implemented in PHP for reading CSV files was very different from, for example, what Microsoft products would create. I mean, it has to do with escaping, if I remember correctly. And I mean, what do you decide, right? I mean, since then Microsoft have made a specification for this. And of course, what we then want to do in PHP is to make sure that we support the specification, but by doing so, we will then break previous behavior, and that is always a really difficult decision to make, right. If it is very clear that it is a bug, then we don't mind changing PHP, even though that could technically break people's code. But if it's unsure, or it's based on a subjective decision, then that makes it a lot harder, right, because we can't definitively say that, yeah, we have a bug here. And if we look at other codebases out there, so many people rely on this. So is the old behavior a bug, or is it a feature in PHP? I mean, these things you have to take one by one, and it's very hard to decide what is a feature and what is a bug in this case.

Ignace Nyamagana Butera 11:22

I think another subject that comes with bugs is that people should be able to fix them. But I suppose that every one of us has a job, so who can fix those bugs?

Derick Rethans 11:33

Technically, everybody who has time and knows C could fix a bug. PHP is an open source project. Our repositories are available on GitHub, or on git.php.net, which is our source of truth, although most people submit bug fixes against the GitHub repository, because it makes it easier to review them and comment on pull requests, for example. But it's open for everybody. It's the same thing with triaging bugs: trying to find out if the bugs that are reported are actual bugs. The bugs.php.net website has a random link in the top right hand corner. And if you click that, you get a random bug that hasn't been resolved yet. If any of the listeners, or maybe you, are interested in looking at these bugs or wanting to attempt to fix them, click random and see what happens. Maybe you get something interesting, maybe you get something really complicated, but in any case, it's possible for everybody to fix a bug. They will get reviewed. A good enough bug fix will get merged.

Ignace Nyamagana Butera 12:31

When people think about open source nowadays, they usually think about semver, and people may think that if they look at the versioning of PHP, they have an idea of whether it is a patch release, a bug fix release, or a feature release. How is this related to bugs, and how does the versioning of PHP work?

Derick Rethans 12:53

PHP's version number consists of three numbers. At the moment, the latest version is 7.4.6. The six is your bug fix release. In bug fix releases, there will not be any new functionality, unless it is in very minor, small, contained parts in extensions. We tend not to want to have these. And unless you can make a good case for it, it's unlikely to happen. But it isn't unheard of. An example I think I can remember is that OpenSSL added a bunch of new APIs, and although those were technically new functions in PHP, they sort of had to be supported as part of making sure that you could run the latest version of OpenSSL, or something like that. But that's an exception there. Now, the middle number, traditionally, in semver, is there for features, right. You bump the middle number, the middle digit, if you have new features, and that is the same in PHP. What we don't really have is a major number that indicates that we are going to break things. The major number in PHP is mostly a marketing number. So at the moment, we have PHP seven four out there. We'll have PHP eight zero next. But that is pretty much a PHP seven five, but with additional functionality that we find important enough to bump the major version from seven to eight for. Having said that, we do have a rule that we don't remove functionality unless we bump the major number. For example, from five to seven, or from seven to eight. So in the course of time we might deprecate functionality, but we don't tend to remove it until we bump the major number. And you also see that if the major number gets increased, there is potentially more effort in removing or deprecating functionality than we would otherwise do if, say for example, it changed from 7.3.0 to 7.4.0. But it doesn't mean that we bump major numbers so that we can break all the things, for example.
So I think the PHP project tries, and we don't always succeed of course, to never break people's code. Unless it's a bug fix.
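The three-part version number described here can be taken apart in a few lines. This is a sketch only: `parseVersion` and `sameBranch` are hypothetical helpers, and the "bugfix" label follows the wording used in this conversation:

```php
<?php
/**
 * Split a version string such as "7.4.6" into its three parts.
 *
 * @return array{major: int, minor: int, bugfix: int}
 */
function parseVersion(string $version): array
{
    // Pad with zeros so that a short string such as "8.0" still works.
    $parts = array_pad(array_map('intval', explode('.', $version)), 3, 0);

    return ['major' => $parts[0], 'minor' => $parts[1], 'bugfix' => $parts[2]];
}

// Two versions that differ only in the last number are bug fix
// releases of the same branch.
function sameBranch(string $a, string $b): bool
{
    $left  = parseVersion($a);
    $right = parseVersion($b);

    return $left['major'] === $right['major']
        && $left['minor'] === $right['minor'];
}

$latest = parseVersion('7.4.6');

// The running interpreter exposes the same three parts as constants.
$running = [PHP_MAJOR_VERSION, PHP_MINOR_VERSION, PHP_RELEASE_VERSION];
```

With these helpers, 7.4.1 and 7.4.6 land in the same branch, which matches the advice earlier in the episode to upgrade within a branch before filing a report.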

Ignace Nyamagana Butera 15:03

That was it for my questions.

Derick Rethans 15:06

Maybe I have some questions for you now. I think it is good to talk about these issues. What are you most surprised with in the way how the PHP process handles bugs and bug reports?

Ignace Nyamagana Butera 15:15

The first thing is, like I say, I've been coding in PHP for more than 15 years, but I only really started to report bugs once I started doing some open source projects. Because before, I think, and I think it's the majority of people, it's like: yes, there is a bug, oh, it's something for PHP, or for any kind of language. I'm not the maintainer. So it's a bug, someone else will report it, not me. I've since changed, because I'm doing some open source myself. I'm like, hey, if I found a bug, I think the best way to resolve that bug is first to report it, and to report it correctly, to the project, to the language, or to whatever has that bug. And once you've made this change in how you think about the language, then you start to ask yourself: okay, how can I do it in the most efficient way, so that the bug gets reported? And then the bug can get tackled by the people who can.

Derick Rethans 16:19

Yeah, and the start of that, as you say, is to always send us a bug report, or send your favorite open source project a bug report.

Ignace Nyamagana Butera 16:26

Exactly.

Derick Rethans 16:27

I can sort of see where you're coming from. Because I can understand that if you're just in an agency, for example, and the only thing you have to do is to make sure that your project is done on time, you can't necessarily wait for the bug to be fixed in PHP anyway, because the product needs to be done by tomorrow, or yesterday. And you're going to have to find a workaround for your issue in that case anyway. And then spending time reporting the bug will just take you time, and you don't have time for that, for example. But of course, if you do that, then everybody else that runs into this bug will have to come up with a workaround, and that means you all end up wasting lots of time.

Ignace Nyamagana Butera 17:04

I remember I had a small story. In one of my previous jobs, someone came to me and we were talking about something, and he said: Oh, but there are no constants on DateTimeImmutable. That's very sad. And I said: no, there are, because I remember I submitted the bug, and it was tackled. And now the constants are on the interface. So DateTimeImmutable has the constants. And he was like: Oh, yeah, but I didn't know. And I was like: it was reported and someone used it. And if you don't report it, then maybe in two years you will ask yourself the same question. Indeed, it takes time between the moment it is reported and the moment it is tackled, because people need to have time to resolve the issue. But if you don't do the first step, which is reporting it correctly, then it will never be solved.
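The outcome of that bug report can be checked directly: the format constants are defined on `DateTimeInterface` (since PHP 7.2), so they are reachable through `DateTimeImmutable` as well. A short sketch with an arbitrary example date:

```php
<?php
$date = new DateTimeImmutable('2020-05-14 12:00:00', new DateTimeZone('UTC'));

// The constant lives on the interface...
$viaInterface = $date->format(DateTimeInterface::ATOM);

// ...and is therefore also available on the implementing class.
$viaImmutable = $date->format(DateTimeImmutable::ATOM);
```

Both calls use the same format string, so the two results are identical.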

Derick Rethans 17:53

And by correctly, that also means doing it in the PHP bug tracker and not complaining on Twitter.

Ignace Nyamagana Butera 17:58

Exactly. Exactly.

Derick Rethans 18:02

Of which I see quite a bit of for Xdebug for example. Thank you very much for taking the time to talk to me, or I should say thank you very much for taking the time to interview me to talk about bugs today. I hope you enjoyed this.

Ignace Nyamagana Butera

Thank you for having me. And hopefully we'll meet again.

Derick Rethans

I'm looking forward to that. Thanks very much.

Ignace Nyamagana Butera 18:21

Thank you.

Derick Rethans 18:23

Thanks for listening to this instalment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.


PHP Internals News: Episode 54: Magic Method Signatures

In this episode of "PHP Internals News" I chat with Gabriel Caruso (Twitter, GitHub, LinkedIn) about the "Ensure correct signatures of magic methods" RFC.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:16

Hi, I'm Derick, and this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 54. Today I'm talking with Gabriel Caruso about his ensure correct signatures of magic methods RFC. Hello Gabriel, would you please introduce yourself?

Gabriel Caruso 0:37

Hello Derick, and hello to everyone as well. My name is Gabriel. I'm from Brazil, but I'm currently in the Netherlands. I'm working at a company called Usabilla, which is basically a feedback company. Yeah, let's talk about this new RFC for PHP eight.

Derick Rethans 0:52

Yes, well, starting off at PHP eight. Somebody told me that you also have some other roles to play with PHP eight.

Gabriel Caruso 0:59

Yeah, I think last week I received the news that I'm going to be the new release manager together with Sara. We're going to basically take care of PHP eight, ensuring that we have new versions every month, that we have stable versions every month, free of bugs. We know that's not going to happen.

Derick Rethans 1:17

That's why there's a release cycle with alphas and betas.

Gabriel Caruso 1:20

Yeah.

Derick Rethans 1:21

I've been through this exactly a year earlier, of course, because I'm doing the seven four releases.

Gabriel Caruso 1:25

Oh, nice. Yeah. So I'm gonna have a lot of questions for you.

Derick Rethans 1:29

Oh, that's fine. It's also the role of the current latest release manager to actually kickstart the process of getting the, in this case, PHP eight release managers elected. Previously, there were only very few people that wanted to do it. So for the seven four releases it was Peter and me. But in your case, there were four people that wanted to do it, which meant that, for the first time I can ever remember, we actually had to hold some form of election process for it. That didn't go as planned, because we ended up having a tie twice, which was interesting. So we had to run a run-off election for the second person, between you and Ben Ramsey. That's going to keep you going for the next three and a half years, likely.

Gabriel Caruso 2:11

Yep.

Derick Rethans 2:12

So good luck with that.

Gabriel Caruso 2:13

Thank you. Thank you very much.

Derick Rethans 2:15

In any case, let's get back to the RFC that we actually wanted to talk about today, which is the ensure correct signatures of magic methods RFC. What are these magic methods?

Gabriel Caruso 2:24

So PHP, let's say out of the box, gives the user some magic methods that every single class can have. We can use those methods for anything, but basically, what magic methods are, are just methods that are called by PHP when a given action happens to the class. So for example, if a class is being constructed, then the construct magic method is going to be called. If I'm calling the serialize function, then the serialize magic method is called, as of PHP seven four or PHP eight, I don't remember. So this is basically what magic methods are: methods where PHP hooks into the classes, and then once a certain action happens with the class, PHP is going to call those magic methods and something magic, so to speak, is going to happen.

Derick Rethans 3:13

And other options are like underscore underscore get, and underscore underscore set.

Gabriel Caruso 3:17

We have, we have a lot.

Derick Rethans 3:19

Exactly, what do people tend to use these magic methods for?

Gabriel Caruso 3:22

So that's something interesting. As the magic method is called by a number of actions, we can use it, for example, let's take the example of an ORM, Doctrine or Eloquent or whichever one. Let's say I'm a maintainer of that library. I don't know what fields you have in your database. So when I'm doing the translation, what I can do is map in a property all those columns and values that I have in the database. And then when you instantiate your entity and you try to access a variable that does not exist, we're going to go to a magic method, in this case get, as I said, and I'm going to say: okay, it's not set in the class, but it is mapped in the entity that I have. So this is one case. We also have the case of testing. You have, for example, the famous PHPUnit test framework: every time that a test case is called with all those methods starting with test, the call magic method is invoked. And then you can perform whatever action you have. You also have middlewares, and the examples go even further.
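
A minimal sketch of the ORM case described above, not code from the episode or from Doctrine or Eloquent. The class name `Entity` and the column names are hypothetical, and typed properties (PHP 7.4 or later) are assumed.

```php
<?php
// Hypothetical ORM-style entity: the column values live in an array,
// and PHP calls __get() when code reads a property that is not declared.
class Entity
{
    private array $columns;

    public function __construct(array $columns)
    {
        $this->columns = $columns;
    }

    // Magic method: invoked for reads of inaccessible or undeclared properties.
    public function __get(string $name)
    {
        if (!array_key_exists($name, $this->columns)) {
            throw new OutOfBoundsException("Unknown column: $name");
        }
        return $this->columns[$name];
    }
}

$user = new Entity(['id' => 7, 'email' => 'user@example.com']);
echo $user->email, "\n"; // the read goes through __get('email')
```

The library author never has to know the column names in advance; the lookup happens at the moment the property is read.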

Derick Rethans 4:32

In the title of RFC you have the word signature, what is the signature?

Gabriel Caruso 4:37

All the attributes that a method can have. So for example, the name of a method is part of its signature. What does it return? What parameters does it take? And also what modifiers: for example, is it static or not? Is it public, private or protected? So all this information together is usually one line in PHP. So for example, private static MyMethod, that receives a string and returns a Boolean. There you go. This is the signature of my method.
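
The spoken example can be written out as follows. This is only an illustration: the class name `Example` and the method body are made up, and reflection is used just to show that the same pieces of the signature are visible to PHP.

```php
<?php
class Example
{
    // Signature: modifiers (private static), name (myMethod),
    // parameter type (string) and return type (bool), all on one line.
    private static function myMethod(string $value): bool
    {
        return $value !== '';
    }
}

// Reflection exposes the same parts of the signature.
$method = new ReflectionMethod('Example', 'myMethod');
var_dump($method->isPrivate(), $method->isStatic());
echo $method->getParameters()[0]->getType()->getName(), "\n";
echo $method->getReturnType()->getName(), "\n";
```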

Derick Rethans 5:06

Because some of these magic methods have been in PHP for a long, long time. Back in the time where we didn't have argument types or return types or perhaps not even static. All the way back from the past PHP hasn't really done anything with signatures because they simply didn't exist. At the moment, which signature checks does PHP already do?

Gabriel Caruso 5:26

I don't remember by which RFC, but I think it was introduced together with the scalar type RFC. But until PHP seven four, only constructors and destructors, only those two magic methods, were being checked: that they have no return type, not even void, just no return type. But in PHP eight, we're going to have the new Stringable interface, and then for every single toString magic method, if it is typed, this is very important, if it is typed, it needs to be string. And these are the only ones: from the 17 that we have, only three in PHP 8 are being checked.

Derick Rethans 6:01

PHP seven four.

Gabriel Caruso 6:02

Yeah, in PHP seven four only two and then PHP eight, we have the new toString.

Derick Rethans 6:07

But this RFC is suggesting to change that, of course.

Gabriel Caruso 6:10

Yeah.

Derick Rethans 6:11

What's the reason why you want to extend these checks to the other magic methods?

Gabriel Caruso 6:14

That brings me back to how I figured that out. I was looking at some bugs, because we have https://bugs.php.net, where we centralize all the bugs of PHP. Then there was a bug report explaining and complaining exactly about that. Like, I can type my magic method. Back in the days I can say, for example, that my toString method is going to return an integer or a Boolean. That makes no sense. And then I was like, yeah, that makes no sense. We need to fix that. And then I started to search: how do we type that? What types do we have? And then I was like, we can in PHP eight, because this is going to be a new major version. So we are allowed to at least vote to do that. We can check if someone is using types, we can check those types. We are not going to force, we are not going to require, we're not going to evaluate or even run static analysis. Nope, we're going to simply check. Okay. Are you saying that this get magic method is going to return anything? Okay, that's okay. Oh, but I want my get to specifically return a string. That's also okay. As for, how do you pronounce that, the Liskov substitution principle, right?

Derick Rethans 6:36

The Liskov substitution principle.

Gabriel Caruso 7:26

Yeah. And so this is what we're going to basically do with this RFC, which is going to be voted on. We're going to simply check if you're using the right types, because, in my opinion, magic methods are a foundation in PHP. As we have these methods across different code bases, across different projects with different behaviours, at least when I'm looking at that code: okay, I'm looking at this magic method, I know what parameters it takes, I know what return type it has. This is one less step to debug when trying to understand what is happening. Because today maybe I'm debugging a toString method that is returning an integer. And I'm like, okay, this is the bug, it's supposed to return a string. But once you ensure all those signatures, it's one less bug that we're going to have in production.

Derick Rethans 8:17

When are these signatures being ensured?

Gabriel Caruso 8:19

It's not at compile time, because PHP does not have a compile time. But it's when the Zend engine is compiling the code: we have a very specific method that is checking all the modifiers. So for example, with the signature that we mentioned before, all the magic methods need to be public. This is being checked. For example, the callStatic magic method needs to be static. So this is also being checked. And then I'm extending how we check signatures to param types and also return types. So during compilation in the Zend VM.

Derick Rethans 8:52

Taking callStatic in the RFC as an example, I see that the name has to be a string and the arguments have to be an array. What happens if you use a different type there?

Gabriel Caruso 9:01

So nowadays if you use a different type, that's allowed. So if you say there that you're going to receive an integer, and you're going to receive a string, this is allowed today. And this is what I mentioned about when you are debugging or analyzing different code bases: you're going to be like, why does the documentation say that we need to receive a string and an array, and this specific code base is receiving a string and an integer? So this is the kind of mismatch I want to avoid. Of course, only when using types, because we also know that PHP in some projects does not use types. And that's perfectly fine. If you're not using types, I'm not going to ask you: hey, you need to type those magic methods. What I'm going to do is: okay, you're using types, and I need to make sure you're using them right, otherwise this is going to be a mess.
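
A small sketch of the callStatic case discussed here. The class name `Facade` is hypothetical; the parameter types follow what is said above (a string name and an array of arguments). The mismatching variant is shown only in a comment, because under the RFC as described it would stop the script from compiling.

```php
<?php
class Facade
{
    // Matches the documented shape: string name, array arguments.
    public static function __callStatic(string $name, array $arguments)
    {
        return $name . '(' . implode(', ', $arguments) . ')';
    }

    // Under the RFC as described above, a typed variant such as
    //   public static function __callStatic(int $name, string $arguments)
    // would be rejected with a fatal error when the class is compiled.
    // Leaving the parameters untyped would still be accepted.
}

echo Facade::find(1, 2), "\n"; // find(1, 2)
```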

Derick Rethans 9:47

If you type it, say use an integer for the name of underscore underscore get, will it give you a warning or a compile error, or parse error? What kind of feedback would you get back from that?

Gabriel Caruso 9:59

While you are running your code, as soon as that class gets referenced, we're going to check. It's not when it is instantiated, it's not when it is called; as soon as, I think, the autoloader detects that class, it's going to parse, it's going to identify, and then it's going to compile, and during the compile time that we mentioned, we're going to identify that. So it's going to be early in the stages. Perhaps as soon as you run something or you would upset me, you're going to have that feedback saying: hey, this is not compatible with what we are expecting.

Derick Rethans 10:32

Is that a warning or type error?

Gabriel Caruso 10:34

It's going to be a fatal error, because this is what we are currently returning with the destructors and constructors.

Derick Rethans 10:41

Yeah, we alluded to mixed already a little bit and the RFC mentions mixed a few times. Of course, mixed isn't a type in PHP yet. So what do you want to do about that?

Gabriel Caruso 10:51

Today is the 11th of May, 2020. Right now we have an RFC in voting to introduce the mixed type to PHP. I'm not going to say if I agree or disagree, it's being voted on. If that RFC gets accepted, and I have already talked with the authors of that RFC, I'm going to wait until they merge into master. I'm going to rebase and adapt my RFC to have those mixed types. And there we go: PHP eight probably can have mixed, and probably can already have the usage of mixed in the magic methods. So either way, I'm going to need to wait for the end of their RFC. If it's approved, there you go, I need to rebase my PR. In the other case, we are going to keep them as comments, because we can't ensure that at compile time with the VM.

Derick Rethans 11:41

At the moment, it looks like that vote will end on May 21. The current votes are 35 to six for passing. So it looks like that will go through.

Gabriel Caruso 11:50

And then I need to rush because we have the upcoming feature freeze of PHP eight. So I need to make sure that I start the vote and implement my RFC before that time.

Derick Rethans 12:00

Feature freeze should be by the end of July. So I think you have plenty of time for that. And of course, as a release manager, you can make an exception. That's how that works. Usually adding extra checks will have an impact on existing code. Is there much impact on existing code here as well?

Gabriel Caruso 12:18

That was the interesting question that I asked myself. Okay, I'm going to touch the magic methods of PHP. I'm going to break some code, and I should identify those breaking changes and map each one in the RFC. How do I map across many projects, many libraries, many PHP code bases out there? How do I do that? I remembered that Nikita, back in his RFC about the parentheses ordering, like how do we parenthesize this ordering and yada yada yada, he made a script where he went through, I think it was, the top thousand or top 10,000 packages on Packagist, which is the official Composer package provider, and he identified everything. And I asked myself how he did that. And actually it was very easy. He just cloned all the repositories. He instantiated a new PHP-Parser instance, that is his magic parser that is behind PHPStan, is behind Psalm, is behind Infection, a lot of big projects where you analyze the code. So you have a code base where you can analyze and say: do I have magic methods wrong? And then I ran this script and identified, I think, six or seven types that were not perfect. For three of them, I have already submitted a pull request, because they were in PHPUnit, and I said to Sebastian: hey, this actually is not right, because I'm proposing this RFC. He was like: okay, perfect, let's merge it. And the other cases are the cases that I mentioned. For example, with get. Get needs to return mixed, but by the LSP, you can narrow it down to an integer or a string. So there you go, at least in the top 10,000 packages of Composer it's not going to be a breaking change. But of course, it's going to be a breaking change for people that I can't map. So this is why it's mentioned in the RFC that if you're using types with magic methods wrong, we're going to warn you.

Derick Rethans 14:13

But at least it's an easy thing to check for. Because even running all your files through PHP minus L should catch it.

Gabriel Caruso 14:20

Yeah, there you go.

Derick Rethans 14:22

So it's a very easy thing to check for. You provided a link to Nikita's script where he checks for those ternaries. Do you have a version of your own script available as well?

Gabriel Caruso 14:33

That's interesting. I thought the RFC was updated. So I'm going to update the RFC, because I do have the script locally.

Derick Rethans 14:39

Then I can link to it for the podcast as well.

Gabriel Caruso 14:41

Okay, perfect.

Derick Rethans 14:42

In the future, are you thinking of extending checks to a few more things?

Gabriel Caruso 14:46

So this is something that I thought about with this RFC: how much do you want to break and explode people's code? And I think starting with checking types in the signature is the first step. The next step is to actually check the returned value. We do that with toString. So for example, although you have the type right, maybe some logic or something is wrong and you're returning an integer. There is a check, beyond the declared type, saying: you're supposed to return a string, you're returning an integer. There is actually a check in the magic method saying this magic method was supposed to return a string. I think that's going to break even more code, because then it's something that I can't measure. So I was like: okay, let's first start with types and then we can take the next step, that is: okay, inside this method, what is being returned? Okay, it's something different from the signature: explode. You're returning something that you were not supposed to return. But this is not a fight that I'm going to pick. So I leave it up for the next major version of PHP or whatever.

Derick Rethans 15:49

Wouldn't PHP's strict versus weak type mechanism already catch these things? So for debugInfo, if you would type that as returning an array, and then you end up returning an object, which is not necessarily wrong, just not what you expected, PHP's return type checking mechanism should already catch that for you.

Gabriel Caruso 16:13

If you have the magic method typed, yes. If it's not typed, no. So we can say that some methods do have that check. And then we're going to expand to when we don't have types in the signature.
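
Derick's debugInfo scenario can be sketched like this. The class name `Report` is hypothetical, and the method is called directly so the error can be caught. The point is that once a return type is declared, PHP's ordinary return type check fires when the method returns, in both weak and strict mode, because an object cannot be coerced to an array.

```php
<?php
class Report
{
    // Typed magic method: the declared return type is enforced at runtime.
    public function __debugInfo(): array
    {
        return new ArrayObject(['rows' => 3]); // an object, not an array
    }
}

$caught = false;
try {
    (new Report())->__debugInfo();
} catch (TypeError $e) {
    $caught = true;
    echo 'TypeError: ', $e->getMessage(), "\n";
}
```

Without the `: array` declaration nothing in this snippet would complain, which is the untyped case Gabriel leaves for a later proposal.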

Derick Rethans 16:24

That's clear now. Do you have anything else to add?

Gabriel Caruso 16:27

The only thing that I want to add is that I have created another RFC, and this is something that I always tell everyone: it is easy to do, it is not impossible. Anyone can go there, identify a bug or catch a bug report and then try to fix it. And this is what I'm doing. Like, although I'm going to release manage PHP eight, I'm also fixing bugs, improving documentation and everything else. This is something that I try to do and share with everyone. So everyone can also be the next contributor to PHP and its evolution.

Derick Rethans 16:57

This RFC isn't out for voting yet. You said you want to sort of wait until mixed gets passed or not. What's the reception been so far?

Gabriel Caruso 17:05

So I asked a couple of key members of the PHP community, both internal and external people. They agree, they said that the right approach is to first check for the signature, because if someone is already using types, that project is type friendly, so we can at least play with that. But if someone is not typing, then this is a bigger fight. And then we're going to talk about that in the future.

Derick Rethans 17:29

Thank you, Gabriel, for taking the time this morning to talk to me. I've learned a few more things about this RFC, so that's always good to know. And again, congratulations on being the PHP eight release manager together with Sara.

Gabriel Caruso 17:41

Thank you very much. Also thank you for inviting me to this new podcast, it's amazing. I always listen to all these famous people of PHP that talk with you. And I'm like, whoa, Derick has invited me, this is going to be so much fun. Thank you very much.

Derick Rethans 17:55

Thanks for listening to this installment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.


PHP Internals News: Episode 53: Constructor Property Promotion

In this episode of "PHP Internals News" I chat with Nikita Popov (Twitter, GitHub, Website) about the Constructor Property Promotion RFC.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:16

Hi, I'm Derick. And this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 53. Today I'm talking with Nikita Popov about a few RFCs that he's made in the last few weeks. Let's start with the constructor property promotion RFC. Hello Nikita, would you please introduce yourself?

Nikita Popov 0:36

Hi, Derick. I am Nikita and I am doing PHP internals work at JetBrains and the constructor promotion, constructor property promotion RFC is the result of some discussion about how we can improve object ergonomics in PHP.

Derick Rethans 0:56

Object ergonomics. It's something that I spoke with Larry Garfield about two episodes ago, where we discussed Larry's proposal or overview of what can be improved with object ergonomics in PHP. And I think we mentioned that you just landed this RFC that we're now talking about. What is the part of the object ergonomics proposal that this RFC is trying to solve?

Nikita Popov 1:20

I mean, the basic problem we have right now is that it's a bit more inconvenient than it really should be to use simple value objects in PHP. And there is two sides to that problem. One is on the side of writing the class declaration, and the other part is on the side of instantiating the object. This RFC tries to make the class declaration simpler, and shorter, and less redundant.

Derick Rethans 1:50

At the moment, what would a typical class with its constructor look like?

Nikita Popov 1:55

Right now, if we take the simple example from the RFC, we have a class Point, which has three properties, x, y, and z. And each of those has a float type. And that's really all the class is. Ideally, this is all we would have to write. But of course, to make this object actually usable, we also have to provide a constructor. And the constructor is going to repeat that. Yes, we want to accept three floating point numbers x, y, and z as parameters. And then in the body, we have to again repeat that, okay, each of those parameters needs to be assigned to a property. So we have to write this x equals x, this y equals y, this z equals z. I think for the Point class this is still not a particularly large burden. Because we have like only three properties. The names are nice and short. The types are really short. We don't have to write a lot of code, but if you have larger classes with more properties, with more constructor arguments, with larger and more descriptive names, and also larger and more descriptive type names, then this makes up for quite a bit of boilerplate code.
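
Written out, the class described here looks roughly like the following sketch (typed properties, so PHP 7.4 or later is assumed; the values passed at the end are only for illustration).

```php
<?php
class Point
{
    // Each name appears once here in the declaration...
    public float $x;
    public float $y;
    public float $z;

    // ...once more as a constructor parameter...
    public function __construct(float $x, float $y, float $z)
    {
        // ...and twice in each assignment.
        $this->x = $x;
        $this->y = $y;
        $this->z = $z;
    }
}

$p = new Point(1.0, 2.0, 3.0);
echo $p->x, ' ', $p->y, ' ', $p->z, "\n";
```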

Derick Rethans 3:16

Because you're pretty much having the properties' names in there three times.

Nikita Popov 3:20

Four times even. One is the property name in the declaration, one in the parameter, and then the assignment has to repeat it twice.

Derick Rethans 3:30

You're repeating the property names four times, and the types twice.

Nikita Popov 3:34

Right.

Derick Rethans 3:36

What is the syntax that you're proposing to improve this?

Nikita Popov 3:39

The syntax is to merge the constructor and the property declarations. So you only declare the constructor and you add an extra visibility keyword in front of the normal parameter name. So instead of accepting float x in the constructor, you accept public float x. And what this shorthand syntax does is to also generate the corresponding property. So you're declaring a property public float x. And to also implicitly perform this assignment in the constructor body. So to assign this x equals x, and this is really all it does. So it's just syntax sugar. It's a simple syntactic transformation that we're doing. But that reduces the amount of boilerplate code you have to write for value objects in particular, because for those commonly, you don't really need much more than your properties and the constructor.
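
With the proposed syntax, the same Point class shrinks to the sketch below. It assumes PHP 8.0 or later, where constructor property promotion is available; the reflection call at the end only confirms that real properties were generated.

```php
<?php
class Point
{
    // A visibility keyword in front of each parameter declares the
    // property and assigns the argument to it implicitly.
    public function __construct(
        public float $x,
        public float $y,
        public float $z
    ) {}
}

$p = new Point(1.0, 2.0, 3.0);
echo $p->x + $p->y + $p->z, "\n";

// The promoted parameters show up as ordinary properties.
var_dump((new ReflectionClass(Point::class))->hasProperty('x'));
```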

Derick Rethans 4:40

Besides public, I suppose you can also use protected and private there as well.

Nikita Popov 4:45

That's right. So you can use all the visibility modifiers. Well, public, protected, private; static does not really make sense. But if we add other modifiers in the future, then those could be used there as well. For example, if we add support for read only properties, then of course, you could also write public readonly float x or something.

Derick Rethans 5:09

The RFC talks about desugaring. How's this implemented? Is this a transformation on the AST, or done in another way?

Nikita Popov 5:17

This is not an AST transform, but I would say close enough. So we just generate the corresponding property declarations and assignments in the compiler. If you inspect the AST with an extension like php-ast, you will see the code as written. So with the public in front of the parameter name, but if you inspect the code in reflection, then it will look as if you declared the property explicitly.

Derick Rethans 5:48

So the RFC talks about a few constraints and what you can and cannot do with those promoted properties. One of the things it talks about is nullability.

Nikita Popov 5:58

Well, we have two different nullability semantics in PHP for historical reasons. One is in parameters, where we say, if you use a type that is not explicitly nullable, but you have a null default value, then we make the type implicitly nullable. While for property types, which are newer, we no longer have this implicit behaviour. So if you want to have a nullable property, you do need to explicitly mark it as nullable. Just using a null default value will result in an error. And the handling is the same here. So if you want to have a nullable promoted property, you have to mark it as nullable.
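
A short sketch of the nullability rule for promoted properties, assuming PHP 8.0 or later. The class name `Measurement` is made up; the rejected form is kept in a comment because it would prevent the script from compiling.

```php
<?php
class Measurement
{
    // Explicit ?float is required for a nullable promoted property.
    public function __construct(public ?float $value = null) {}

    // By contrast, `public float $value = null` as a promoted parameter
    // is rejected at compile time: the null default does not make the
    // property type implicitly nullable.
}

$empty = new Measurement();
$full  = new Measurement(2.5);
var_dump($empty->value, $full->value);
```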

Derick Rethans 6:43

And you cannot just rely on setting the default to null?

Nikita Popov 6:46

Exactly, but I think it's like a detail. And really this could go either way. I just prefer the explicit nullability because this seems like the direction we are going in the future. I don't know if we will ever remove this implicit behaviour. Maybe not. But I think nowadays the explicit one is preferred.

Derick Rethans 7:10

Less magic is better.

Nikita Popov 7:11

Less magic, exactly.

Derick Rethans 7:13

The RFC also has like constraints in there. You can also define a constructor in traits and abstract classes. Can you use constructor property promotion there as well?

Nikita Popov 7:23

In traits? Yes, I mean in traits, using it will be a little bit weird. But there is no reason why it can't work. After all traits can have a constructor that will be used in the using class. And traits can also have properties that get imported. So the same mechanism works there as well. It does not work for abstract constructors or constructors in interfaces. The syntax also implies that you have some assignments inside the body of the constructor, and if we have an abstract constructor, then we could not emit these assignments anywhere. We could support it as a special case, like saying that it only declares the properties but skips those assignments. But I don't know how often you've used abstract constructors, I probably used them like maybe once or twice in all my time working with PHP. So I don't think they really need extra support in that area.
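
The trait case can be sketched as follows, assuming PHP 8.0 or later. The names `HasName` and `Customer` are hypothetical; the trait contributes both the promoted property and the constructor to the class that uses it.

```php
<?php
trait HasName
{
    // The promoted property and the constructor are both imported
    // into whichever class uses this trait.
    public function __construct(private string $name) {}

    public function name(): string
    {
        return $this->name;
    }
}

class Customer
{
    use HasName;
}

$customer = new Customer('Ada');
echo $customer->name(), "\n"; // Ada
```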

Derick Rethans 8:25

It would also then introduce an inconsistency where promoted properties in abstract classes, or abstract class constructors if that's the thing, would be different from normal class constructor property promotion. How does the inheritance work? Is it working in the same way or is there no specific difference in it?

Nikita Popov 8:44

Based on like discussion feedback, I think inheritance is the largest point of confusion with this syntax. The thing is that it does not really have any special interaction with inheritance. So you can just follow this like syntactical transformation it does, which does not have any impact on inheritance. But the thing is, if you just look at the code, and you see you have the parent class defining the constructor, and the child class defining the constructor, and then you're wondering, well, is there some kind of connection between the parameters? The promoted parameters declared in one constructor and the other one? And the answer is simply: No, there isn't. Those have nothing to do with each other. And even more generally, constructors are a bit of a special case where inheritance is concerned. So usually, we say that methods always have to be compatible with the parent method. So the signature has to be compatible, the return type has to, well, not match, but be covariant. And similar for the argument types, but this rule does not apply for the constructor. So the constructor really belongs to a single class, and constructors between parent and child class do not have to be compatible in any way.
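
To make the inheritance point concrete, here is a sketch assuming PHP 8.0 or later, with hypothetical `Animal` and `Dog` classes. The child constructor has a different parameter list from the parent, which is accepted because constructors are exempt from the usual signature compatibility rules, and the two sets of promoted parameters are unrelated.

```php
<?php
class Animal
{
    public function __construct(public string $name) {}
}

class Dog extends Animal
{
    // $name is a plain parameter here; only $age is promoted in Dog.
    public function __construct(string $name, public int $age)
    {
        // The parent's promoted $name is only assigned because the
        // parent constructor is called explicitly.
        parent::__construct($name);
    }
}

$dog = new Dog('Rex', 3);
echo "{$dog->name} is {$dog->age}\n"; // Rex is 3
```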

Derick Rethans 10:09

Are there any types that you can't use for constructor property promotion?

Nikita Popov 10:14

Just callable. Because callable is not a valid property type. Well, there is one more thing: you can't use a variadic argument. Well, if you write a variadic argument, you write something like int, dot, dot, dot, whatever. But the type you're actually writing is int, because that's the type of each individual argument. But all of that gets collected into an array. So the type of the corresponding property would have to be array. So we would have to do an extra transform that's maybe not super obvious. And so I've left this part out.
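
Since a variadic parameter cannot be promoted, the property has to be declared and assigned by hand. The sketch below uses a made-up `Sum` class and shows why: each argument is an int, but the property that collects them is an array.

```php
<?php
class Sum
{
    /** @var int[] the collected variadic arguments */
    private array $numbers;

    // `private int ...$numbers` is not allowed as a promoted parameter,
    // so the declaration and the assignment are written explicitly.
    public function __construct(int ...$numbers)
    {
        $this->numbers = $numbers;
    }

    public function total(): int
    {
        return array_sum($this->numbers);
    }
}

$sum = new Sum(1, 2, 3);
echo $sum->total(), "\n"; // 6
```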

Derick Rethans 10:50

And also PHP's type system doesn't support defining an array of integers. It only supports describing an array. At the time we're talking, at the end of April, this hasn't gone up for a vote yet. When do you think this will happen?

Nikita Popov 11:05

The RFC will need one small adjustment because the attributes RFC is currently in voting and it very much looks like it's going to be accepted. We will need to also consider support for attributes on the promoted properties. I think the only small question there is, what does the attributes apply to? Because this could apply to the parameter or to the property, or both.

Derick Rethans 11:34

How would you actually set these attributes because from what I understand docblocks, you can only use in front of a method name or a property declaration. How would you define a different attribute for each of the promoted properties?

Nikita Popov 11:48

I believe that the attributes RFC already supports attributes on parameters, so that shouldn't be a problem.

Derick Rethans 11:55

So it allows for setting a specific attribute for each of the arguments coming into the constructor. But that didn't quite answer the question. When do you think we'll be voting on this?

Nikita Popov 12:05

Maybe in a week or so.

Derick Rethans 12:06

By the time this podcast comes out?

Nikita Popov 12:09

Well, we have had a lot of activity recently in PHP internals. So I guess we are one of the few places that benefit from the Coronavirus, because people now have time to work on PHP.

Derick Rethans 12:24

Yeah, I mean, I'm looking at so much extra code now. Interestingly, when going to the RFC, and as a side note, it mentioned somewhere that when defining more properties, the line length goes too long, because you now have this extra keyword in there. And that could benefit from then separating the constructor arguments over multiple lines. And that that raises the point is that you can use a trailing comma in arrays when you call functions, but not in argument lists. And I saw that you've also made another RFC for adding the trailing commas in the parameter lists.

Nikita Popov 12:58

So that's like a super simple RFC, just allow that extra comma. This has actually already been discussed a couple of times in the past, and has been declined at that point.

Derick Rethans 13:13

I'm just having a quick look at it. Because this RFC is already voting to see what the current votes are, and it's 58 for and one against.

Nikita Popov 13:21

I think like the main counter argument people have against this kind of trailing comma stuff is, well, doesn't that mean that it encourages writing methods with a lot of parameters, which is a bad style. I don't think it does. And I think that even if you don't have a lot of parameters, it's fairly easy to run into line length limitations, because nowadays we like to use expressive long parameter names, and expressive long type names, so even without adding an extra protected in front of all of that, you can really easily get signatures that split across multiple lines. In which case having the trailing comma is nice, mainly because we already write it everywhere else.
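
A sketch of a signature split across lines with a trailing comma after the last parameter, assuming PHP 8.0 or later, where this is accepted. The function and parameter names are made up for illustration.

```php
<?php
function createInvoiceLine(
    string $customerName,
    float $amountDue,
    string $currencyCode = 'EUR', // trailing comma after the last parameter
): string {
    return sprintf('%s owes %.2f %s', $customerName, $amountDue, $currencyCode);
}

$line = createInvoiceLine('Acme', 12.5);
echo $line, "\n"; // Acme owes 12.50 EUR
```

Adding a fourth parameter later then touches only one line in a diff, which is the same reason trailing commas are used in multi-line arrays.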

Derick Rethans 14:12

Except for in arguments to methods, because you can't.

Nikita Popov 14:17

Well, there are also a couple of other places where you can't. For example, like if you have a class that implements many interfaces, then you can't put a trailing comma after the last interface. And this is something we could also allow. But I think the relevant distinction there is that this is kind of a freestanding list. Um, it's not wrapped inside brackets, or parentheses. So it kind of looks a little bit weird if you have a trailing comma there, which is possibly also why a previous RFC that simply allowed trailing commas everywhere did not pass.

Derick Rethans 14:58

As I said, it looks likely that will pass.

Nikita Popov 15:01

Yes, I think it's unlikely that we're going to get 13 new no votes.

Derick Rethans 15:07

What I also find interesting is that an RFC that you've mentioned earlier in the episode, attributes, is going to pass as well. At the moment, there's only one no vote there as well, which surprised me because the last time attributes was discussed it was very much not going to pass whatsoever.

Nikita Popov 15:27

Yeah, this is an interesting effect. It's hard to say why it happens. Probably, well, part of the reason is just that issues that were raised on previous proposals have been addressed. For example, the last one by Dmitry had the very controversial aspect where it exposed the AST, the abstract syntax tree representation of the attributes, which is gone from this one, and that removes one of the contentious issues. But I think another part is just that sometimes it takes multiple proposals to really get an idea through internals. We have this situation pretty commonly that the first RFC fails, the second RFC fails, and then the third one does pass.

Derick Rethans 16:18

It's also taken five years or so. And people's opinions might just change about these things.

Nikita Popov 16:23

Exactly. The previous proposals might just have been before their time.

Derick Rethans 16:29

I saw you had made one other tiny RFC, which is the stricter type checks for arithmetic slash bitwise operators. What is that about?

Nikita Popov 16:40

Very simple. So if you write, well, x minus y, and x is an array, and y is a resource, what do you expect the outcome to be? There is really no reasonable way that can work. So this RFC proposes to make the arithmetic and the bitwise operators, when working on arrays, objects, and resources, simply throw an exception. And the motivation for that was the operator overloading RFC, which has in the meantime been declined. But still, this was a concern raised there: while you can overload operators for objects, you still get pretty weird behaviour if an overloaded operator is missing, because we currently handle that with just a notice and assuming that the object is equal to one, which is usually not a useful or desired behaviour.
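As a rough sketch of the behaviour described here, the following assumes PHP 8.0 or later, where arithmetic on an array or a plain object throws a TypeError. The helper function trySubtract() is hypothetical and exists only to make the outcome visible.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns the difference, or the string 'TypeError'
// when the operands are not supported by the minus operator.
function trySubtract(mixed $left, mixed $right): mixed
{
    try {
        return $left - $right;
    } catch (TypeError $e) {
        return 'TypeError';
    }
}

var_dump(trySubtract(5, 3));            // ordinary integer arithmetic
var_dump(trySubtract([1, 2], 3));       // array operand: TypeError on PHP 8.0+
var_dump(trySubtract(new stdClass, 1)); // object without overloading: TypeError on PHP 8.0+
```

On PHP 7 the same object expression would only have raised a notice and treated the object as one, which is the behaviour discussed above.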

Derick Rethans 17:39

There is of course, one exception where you can still use an arithmetic operator, which is the plus between arrays.

Nikita Popov 17:46

That's right, yeah. So array plus array is similar to an array merge operation. And that one is, of course, well defined and remains supported.
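A small sketch of array plus array, next to array_merge() for comparison. The variable names and values are made up for illustration. The two are similar but not identical: with the plus operator, the left-hand operand wins when a key exists on both sides, and integer keys are not renumbered.

```php
<?php
declare(strict_types=1);

// Illustrative data only.
$defaults  = ['host' => 'localhost', 'port' => 3306];
$overrides = ['port' => 5432, 'user' => 'app'];

// Union: keys already present in the left operand are kept as they are.
$union = $overrides + $defaults;

// array_merge(): for string keys, later arrays overwrite earlier ones.
$merged = array_merge($defaults, $overrides);

// With integer keys the difference is more visible.
$listUnion = [1, 2] + [3, 4, 5];           // keys 0 and 1 come from the left, key 2 from the right
$listMerge = array_merge([1, 2], [3, 4, 5]); // values are appended and renumbered

var_dump($union, $merged, $listUnion, $listMerge);
```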

Derick Rethans 17:55

Whereas things like true divided by 17, although not sensible, will continue to work.

Nikita Popov 18:00

Right, that also. Yeah, because this is simply a much more contentious issue: whether implicitly treating true as one is a good idea or not. Personally, I know I have written code where I, for example, add up booleans, just as a count of how often something is true. Maybe, style-wise, it would be better to write an explicit integer cast. But the code is also not really wrong. This may be a discussion for another time.
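The boolean counting pattern mentioned here might look like the following sketch. The data and variable names are invented for illustration; both loops produce the same count, and the second one uses the explicit integer cast.

```php
<?php
declare(strict_types=1);

// Illustrative data only.
$ages = [17, 22, 35, 15];

// Implicit: true is treated as 1 and false as 0 when added to an integer.
$adults = 0;
foreach ($ages as $age) {
    $adults += ($age >= 18);
}

// Explicit: the same count with an integer cast.
$adultsExplicit = 0;
foreach ($ages as $age) {
    $adultsExplicit += (int) ($age >= 18);
}

// Arithmetic on booleans keeps working: true is treated as 1 here.
$fraction = true / 17;

var_dump($adults, $adultsExplicit, $fraction);
```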

Derick Rethans 18:33

As we've said before, the smaller the RFCs, the easier it is to get them passed as well. Alright, Nikita, thanks for taking the time this morning to talk to me about constructor property promotion RFC, and a few others. We'll see whether they get passed for PHP eight.

Nikita Popov 18:48

Thanks for having me Derick, once again.

Derick Rethans 18:52

Thanks for listening to this instalment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.