Streaming, working from home, and trapping mail

Jake and Michael discuss their foray into streaming, what it's like working from home (and staying there!), and some of their favourite tooling for local mail testing.

This episode is sponsored by Fathom Analytics and was streamed live.




PHP Internals News: Episode 47: Attributes v2


In this episode of "PHP Internals News" I chat with Benjamin Eberlei (Twitter, GitHub, Website) about an RFC that he wrote, that would add Attributes to PHP.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:16

Hi, I'm Derick. And this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 47. Today I'm talking with Benjamin Eberlei about the attributes version 2 RFC. Hello, Benjamin, would you please introduce yourself?

Benjamin Eberlei 0:34

Hello, I'm Benjamin. I started contributing to PHP in more detail last year with my RFC on the extension to DOM. And I felt that the attributes thing was the next great or bigger thing that I should tackle because I would really like to work on this and I've been working on this sort of scope for a long time.

Derick Rethans 0:58

Although the RFC is titled attributes version two, there was actually never an attributes version one. What's happening there?

Benjamin Eberlei 1:05

There was an attributes version one.

Derick Rethans 1:07

No, it was called annotations?

Benjamin Eberlei 1:08

No, it was called attributes. There were two RFCs. One was called annotations, I think it was from 2012 or 2013. And then in 2016, Dmitri had an RFC that was called the attributes, original attributes RFC.

Derick Rethans 1:25

So this is the version two. What is the difference between attributes and annotations?

Benjamin Eberlei 1:30

It's just naming. Essentially, different languages have this feature, which we'll probably explain in a bit. In Java, it's called annotations. In languages that are maybe closer to home for PHP, so C#, C++, Rust, and Hack, it's called attributes. And then Python and JavaScript also have it, though it works a bit differently, and it's called decorators there.

Derick Rethans 1:58

What are these attributes or annotations to begin with?

Benjamin Eberlei 2:01

They are a way to declare structured metadata on declarations of the language. So in PHP or in my RFC, this would be classes, class properties, class constants and regular functions. You could declare additional metadata there that sort of tags those declarations with specific additional machine readable information.

Derick Rethans 2:27

This is something that other languages have. And surely people that use PHP will have done something similar already anyway?

Benjamin Eberlei 2:35

PHP has this concept of doc block comments, which you can access through an API at runtime. They were originally, I guess, added to support the PHPDoc project, which existed at that point, to declare types on functions and everything. So this goes way back to the time when PHP didn't have type hints and everything had to be documented everywhere so that you at least roughly have an idea of what types would flow in and out of functions.
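As a rough illustration of the doc block API mentioned here, the sketch below reads a comment through reflection and pulls the `@param` tags out of it. The function `add()` and the regular expression are assumptions made up for this example; real tools such as Doctrine Annotations use a far more complete parser.

```php
<?php

/**
 * Adds two numbers.
 *
 * @param int $a
 * @param int $b
 * @return int
 */
function add($a, $b)
{
    return $a + $b;
}

// getDocComment() returns the raw comment text, or false when there is none.
$doc = (new ReflectionFunction('add'))->getDocComment();

// The engine only stores one opaque string; any structure has to be
// recovered by the caller, here with a deliberately naive pattern.
preg_match_all('/@param\s+(\S+)\s+(\$\w+)/', $doc, $matches, PREG_SET_ORDER);

foreach ($matches as [, $type, $name]) {
    echo "$name is documented as $type\n";
}
```

Because every library has to parse that string itself, two libraries can easily disagree about what a tag means, which is one of the problems the interview comes back to later.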

Derick Rethans 3:07

Why is that now no longer good enough?

Benjamin Eberlei 3:09

Essentially, userland developers use doc blocks to put metadata in there, and you could access them through an API. We had two sort of standards, or we still have two standards that use this. The documentation standard coming from the phpDocumentor community. And then the main runtime use case that exists now is covered by the Doctrine Annotations library, which, incidentally, I have also worked on a lot. It is used, for example, by the Symfony community, by the Drupal community, and by a few other, smaller communities as well that wanted to go in the direction of using annotations, or attributes in this case.

Derick Rethans 3:53

What would Doctrine use an annotation for?

Benjamin Eberlei 3:55

I said before that annotations add metadata to declarations. So let's say you have in your code, for example, classes that you want to store in the database. So you need to map PHP classes to database tables and back. Usually, you would do that using some kind of configuration. And configuration can take many forms. So the easiest way would be to write this in PHP, say, this is the column name, this is the field name, this is the class name and then store and use this information. And then you can go and store this in ini files, YAML files, XML files. The problem with this kind of approach is often that you have the configuration file and you have the class, and they are totally separate from each other, usually in very different places of the codebase. This is not some kind of configuration that is fluid. It's very, very static configuration that depends on the class. And it will not really change unless the class also changes. So changes are usually done together. In this case, it might make sense to put the configuration on to the class. Because then you see the declaration, you see it's configured in some way. And then you can more easily understand that changes affect each other in some way. And it leads to fewer mistakes, in my opinion. And it makes it a little bit more obvious that the class is used in some configured way.

Derick Rethans 5:26

We've had a quick look at what annotations are. The RFC introduces them in a different way. The attributes that you're now proposing, how are they different from the doc block comments?

Benjamin Eberlei 5:37

The idea is that we introduce a new syntax that is independent of the doc block comments. Essentially, before each declaration, you can use the less-than symbol twice, then the attribute declaration, and then the greater-than sign twice. This is the syntax I've reused from the previous attributes RFC. And Dmitri at that point used the syntax from Hack. And it makes sense to reuse this, not because Hack and PHP are going in the same direction any more, but because when Hack introduced it, they had the same problem of which symbols are actually still easy to use. And we do have a bit of a problem in PHP with the kind of free symbols that we can still use at certain places. And less-than and greater-than at this point are easy to parse. There are a bunch of alternatives and one thing that I will probably propose is an alternative syntax where we start with a percentage sign, then the square bracket open and then a square bracket close. This is more in line with how Rust declares attributes, although Rust uses the hash symbol, which we can't use because it starts a comment in PHP.

Derick Rethans 6:54

And you don't want to use emojis.

Benjamin Eberlei 6:55

Some crazy people propose to use emojis which would easily work in PHP, but I guess it would be hard to remember the number to get the Unicode sign.

Derick Rethans 7:06

Within the two opening less-than signs and the two greater-than signs that close it, what's in the middle?

Benjamin Eberlei 7:12

You declare an attribute name. And then you have a parenthesis open and a parenthesis close to pass optional arguments. You don't have to use them. So you can use only the attribute name if you just want to tag something: this is a validator, or this is an event listener, whatever you come up with to use attributes for. But if you need to configure something in addition, then you can pass arguments. The syntax looks as if you were constructing a new class, except that you don't have to put the new keyword in front of it.
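To make the shape of a declaration concrete, here is a minimal sketch. Note one deliberate difference from the interview: the `<<...>>` form discussed here was the proposal at the time, while PHP 8.0 eventually shipped attributes with the `#[...]` syntax, so the example uses that form in order to run. The classes `EventListener`, `Column` and `SendWelcomeMail` are hypothetical names invented for illustration.

```php
<?php

// Hypothetical attribute classes, declared only for this example.
#[Attribute]
final class EventListener
{
}

#[Attribute]
final class Column
{
    public function __construct(
        public string $name,
        public bool $nullable = false,
    ) {
    }
}

// Attribute name only: a plain tag without arguments.
#[EventListener]
final class SendWelcomeMail
{
    // Attribute with arguments, written like a constructor call without `new`.
    #[Column('email_address', nullable: true)]
    public ?string $email = null;
}
```

Declaring the attributes has no effect on how `SendWelcomeMail` behaves; the metadata only becomes visible when something asks for it through reflection.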

Derick Rethans 7:45

It looks like function arguments pretty much.

Benjamin Eberlei 7:47

Yes, exactly. Yeah.

Derick Rethans 7:48

What kind of values can you use in the optional arguments to the attributes?

Benjamin Eberlei 7:53

The attributes are not really runnable code in a way. Since they are declarations, they don't allow arbitrary PHP code to run there. What is obviously allowed are simple literal values, so a number, or a fixed string, a fixed array declaration, and all these kinds of things are possible. What is also possible is exactly the same expressions that you can also declare in class constants. So, in the class constants, you can do simple mathematical expressions, you can reference other constants. So, this is something that will be very interesting for attributes, to reference class names for example.
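The kinds of arguments described here can be sketched as follows, again using the `#[...]` syntax that PHP 8.0 shipped rather than the `<<...>>` form under discussion. `Cache`, `ArrayStore` and `buildReport()` are hypothetical names; the point is only that literals, arrays, arithmetic on constants and `::class` references are all accepted.

```php
<?php

#[Attribute]
final class Cache
{
    public const ONE_HOUR = 60 * 60;

    public function __construct(
        public int $ttl,
        public array $tags = [],
        public ?string $store = null,
    ) {
    }
}

final class ArrayStore
{
}

// A constant expression, a fixed array and a class-name reference as arguments.
#[Cache(Cache::ONE_HOUR * 24, tags: ['reports', 'daily'], store: ArrayStore::class)]
function buildReport(): string
{
    return 'report';
}

// getArguments() evaluates the constant expressions without creating a Cache object.
$args = (new ReflectionFunction('buildReport'))->getAttributes()[0]->getArguments();

var_dump($args);
```

A function call or a `new` expression inside the argument list would be rejected at compile time in PHP 8.0, because it is not a constant expression.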

Derick Rethans 8:34

What happens if you define an attribute on a declaration element?

Benjamin Eberlei 8:38

What happens is that while the PHP script gets compiled, it will see that there are attributes declared and it will parse the attributes and similar to the doc block store them on the internal structure for future reference. Attributes are parsed in my current proposal in a way that you can have every attribute just once. This is something that is still under heavy discussion, because there are a few good ideas why you would need two, or multiple. Essentially similar to how a doc block is a string, we then store an array, which represents the attributes belonging to the class or the function or the constant. And this is something that the engine stores and also stores it in OPCache.

Derick Rethans 9:27

How would you access these attributes?

Benjamin Eberlei 9:28

Attributes are accessed through the reflection API. The reflection API also allows access to doc blocks. For attributes that would be a new function called getAttributes(). And it returns a list of all attributes using a new reflection class called ReflectionAttribute. There you can access what name does this attribute have? What are the arguments that are passed? And then this goes into one of the next features of this RFC proposal. You can also ask it to return this attribute as an object instance.
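A minimal sketch of that lookup, assuming PHP 8.0 or later and its `#[...]` syntax; the `Table`, `Audited` and `User` classes are hypothetical.

```php
<?php

// Hypothetical attributes used only for this sketch.
#[Attribute]
final class Table
{
    public function __construct(public string $name)
    {
    }
}

#[Attribute]
final class Audited
{
}

#[Table('users'), Audited]
final class User
{
}

$reflection = new ReflectionClass(User::class);

$found = [];
foreach ($reflection->getAttributes() as $attribute) {
    // Each entry is a ReflectionAttribute; no attribute class is instantiated here.
    $found[$attribute->getName()] = $attribute->getArguments();
}

print_r($found);
```

The same `getAttributes()` method is available on the reflection objects for functions, methods, properties and class constants.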

Derick Rethans 10:05

An object instance of which class though?

Benjamin Eberlei 10:07

Attributes, and this is something that is different to the initial version, the version one attributes RFC: attribute names resolve to class names. That means if you declare an attribute, for example, Foo, and you have an import for a class, MyApplication\Foo, then during parsing the attribute will be resolved to the MyApplication\Foo name. It uses the same mechanism for class resolving that is used in every script. It reflects the use statements that are declared in the file. And you can use namespaces, namespace operators to reference the attributes as well.

Derick Rethans 10:49

These are attributes, not classes, so I don't quite see what the link between the attribute names and the classes is?

Benjamin Eberlei 10:55

One problem with the original doc block based system was that there are conflicts between attributes of different systems. One library would have a type annotation, or a var annotation, and some other library would also use it. This could lead to conflicts if the syntax for them was slightly different: multiple parsers would use the same attribute and parse it differently, and this could lead to errors. One problem that was mentioned in the initial attributes RFC and that, I think, a few voters also used as a reason for voting no is that there was no namespacing, which means that different libraries could clash in their use of attributes. My idea was we already have classes, we have namespacing. We can resolve this by using this mechanism. You declare an attribute and an attribute always resolves to a class. In the best case scenario, you would also declare this class in your code. Essentially, the attribute is not an attribute, but it's a special class that represents an attribute. This is also shown in the code by having an additional interface, a sort of marker interface, that attributes can implement to make it obvious that they are used as an attribute.

Derick Rethans 12:19

You mentioned that you could access the attributes through reflection API, and you can get them out as an object?

Benjamin Eberlei 12:25

Yes, this is why I mentioned before that the syntax sort of looks like constructing a new object, but without the new keyword. When you access the objects through the reflection API, it would essentially instantiate the class, and all the arguments that you put into the attribute declaration are passed into the constructor of the object. And this is why the connection is there between a class and an attribute. It directly goes to instantiating the attributes as an object using the arguments and giving the developer access to them.
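The link between declaration and constructor can be sketched like this. In PHP 8.0 as released, the method that performs the instantiation is `ReflectionAttribute::newInstance()`; `Route` and `ArticleController` are hypothetical names for the example.

```php
<?php

#[Attribute]
final class Route
{
    public function __construct(
        public string $path,
        public array $methods = ['GET'],
    ) {
    }
}

final class ArticleController
{
    #[Route('/articles', ['GET', 'POST'])]
    public function index(): void
    {
    }
}

$method = new ReflectionMethod(ArticleController::class, 'index');

// Passing a class name filters the list to attributes of that class.
$attribute = $method->getAttributes(Route::class)[0];

// The arguments from the declaration are handed to Route::__construct().
$route = $attribute->newInstance();

echo $route->path, ' ', implode('|', $route->methods), "\n";
```

Because the arguments go through a normal constructor, the attribute class can validate them or apply defaults in the same way as any other object.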

Derick Rethans 13:00

Does it only do something like this when you use the getObject() on the reflection attributes? Or is it also possible that I don't care about these class things whatsoever, and I can just get a list of attributes and their optional values that are associated with them?

Benjamin Eberlei 13:16

You don't have to have a class, and the class name resolving in PHP is independent of classes actually existing. The attributes RFC respects that. You can just import anything that is not a class and use an import statement to shorten the attribute usage, or you can use the absolute namespace syntax to put a fully qualified attribute name into your code. And it wouldn't fail. The failure would only happen when you call the method on ReflectionAttribute to get the attribute as an object. So this is something where the RFC is also in flux, and I'm about to change it. The first version mentioned that attributes will always be autoloaded when they are declared at compile time. This would essentially treat attributes similar to base classes or interfaces, in a way that they are always resolved, they're always checked. However, this is a little bit overkill for userland attributes. And a lot of the feedback was that this should only happen when the reflection API is used. So I'm going to change this. One thing that we do need to handle in a way is built-in attributes. One reason why I want to add this RFC as well is that there are a few use cases coming up in PHP itself that could benefit a lot if we had built-in attributes. We don't have a clear path forward there yet, but Nikita has published his ideas on editions. So there are some paths forward to having PHP code work slightly differently depending on what developers want. Attributes could be helpful there. Other things, for example, the JIT. The JIT has features where you can at the moment use doc block comments to declare methods as always JIT-able or never JIT-able. Dmitri used doc block comments to check for a JIT or no JIT tag in there. This is essentially something that attributes should be used for, because it should be machine readable. Then there's a lot of other stuff that, for example, Rust also put forward that PHP is struggling with: conditional declarations of functions.
For example, Symfony has a polyfill library that takes functions that exist in higher versions and reimplements them so that they're also available in lower versions where they don't exist in core. There are a lot of hacks around the conditional declaration of functions and classes and stuff that make it difficult for OPCache to actually cache the files. I believe there are even more problems if you use these kinds of files with preloading. Essentially what could be done with attributes would be something like: conditionally declare this function only if it's on PHP 7.3 and lower, something like this.
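The point made at the start of this answer, that an attribute name does not need a matching class until it is turned into an object, can be sketched as below. This reflects how PHP 8.0 ended up behaving, with the `#[...]` syntax; the name `Vendor\Missing\Marker` is a made-up attribute that is deliberately never declared.

```php
<?php

// Vendor\Missing\Marker is intentionally not declared anywhere.
#[Vendor\Missing\Marker('some value')]
function tagged(): void
{
}

$attribute = (new ReflectionFunction('tagged'))->getAttributes()[0];

// Name and arguments are available without the class existing.
echo $attribute->getName(), "\n";
print_r($attribute->getArguments());

try {
    // Only instantiation needs the class, so only this step can fail.
    $attribute->newInstance();
    $failed = false;
} catch (Error $e) {
    $failed = true;
    echo 'Instantiation failed: ', $e->getMessage(), "\n";
}
```

Tools that only need names and arguments, such as static analysers, can therefore read attributes from code whose attribute classes are not installed.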

Derick Rethans 16:13

You just mentioned using JIT or no JIT as an annotation. Does that also mean that extensions have easy access to these attributes?

Benjamin Eberlei 16:21

OPCache is not a PHP core functionality. It's still its own extension. The idea is that extensions have access to attributes in a very simple way. So there will be a Zend API, sort of an internal name for an API that the Zend engine provides to extensions, and extensions will be able to access attributes and make decisions based on this. Extensions can already hook into the compile step of PHP and there's a hook called zend_ast_process. During AST processing, you can do stuff. That would be one way for extensions to look at attributes and maybe change code if they want. Then the engine obviously has tonnes of other hooks where the declarations are available in the data structure that the Zend engine provides. So there's zend_class_entry, for example, where you could look into the attributes as an extension and make decisions.

Derick Rethans 17:20

This is a pretty new RFC, and hence there are always going to be a few open issues, because we like to argue about stuff. What are the open issues on this RFC?

Benjamin Eberlei 17:29

This is the seventh RFC on this topic. So there has been a lot of discussion. I guess this feature is, in a way, quite controversial because of the implementation details. A lot of my work now will be to find the best implementation that can actually make this feature part of core by getting enough votes for it. And so I gathered a lot of feedback from the community; I also talked a lot to contributors. One change that I will probably be making is allowing multiple attributes. As I said before, the autoloading has to be clarified. There has to be some distinction between internal attributes and userland attributes in a way that doesn't require autoloading. Hack, for example, has __ as a magic prefix, which I want to avoid, because it puts the whole magic methods argument back on the table. We need to have something to make a distinction between userland and internal attributes, because the internal attributes need to be validated very strictly at compile time. And the userland attributes need to be validated only when you call the getAsObject() method on the reflection API.

Derick Rethans 18:42

How long do you think there'll be before you put this RFC up for a vote?

Benjamin Eberlei 18:46

It's a bit tricky because this issue is so controversial. I don't want to invest months of work and then get a no vote. And so I do want to have some feedback quickly. I do realise that the first draft needs some work and clarifications that would otherwise lead to no votes from contributors. So I hope to get this done in, let's say, two to four weeks of additional work.

Derick Rethans 19:09

All right, Benjamin. That was a great explanation of the attributes version two RFC.

Benjamin Eberlei 19:16

Thank you for having me, and I really appreciate it again.

Derick Rethans 19:21

Thanks for listening to this instalment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.


Magento’s Evolution, Ecommerce, Development Environments, and Enterprise Software.

In Episode 30

In this episode, we dive into How Magento is Evolving and chat with Magento evangelist Ben Marks.

Topics

  • The free magazine issue courtesy of Adobe and Magento.
  • An interview with Ben Marks on how Magento went from an open-source project to its acquisition, the value of the community and ecosystem, how Magento 2 changed the landscape, how to get started working with the platform, and the future.
  • Approaches for updating legacy codebases.
  • Using asynchronous processes.
  • How stepping away can help you when you’re stuck on a problem.
  • Options for setting up development environments for a project.
  • What exactly the term “Enterprise Software” means.

The post Magento’s Evolution, Ecommerce, Development Environments, and Enterprise Software. appeared first on php[architect].


PHP Internals News: Episode 46: str_contains()


In this episode of "PHP Internals News" I chat with Philipp Tanlak (GitHub, Xing) about his str_contains() RFC.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:16

Hi, I'm Derick. And this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 46. Today I'm talking with Philipp Tanlak about an RFC that he's made titled str_contains. Philipp, would you please introduce yourself?

Philipp Tanlak 0:35

Hey, Derick. My name is Philipp. I'm 25 years old and I live in Germany. I work for an IT service company, which mainly does development and maintenance of IT projects. We specialise in the maintenance of e-commerce websites and create enterprise applications.

Derick Rethans 0:52

How long have you been using PHP for?

Philipp Tanlak 0:54

I've been using PHP for quite a long time now that might be six years I guess.

Derick Rethans 0:58

What brought you to creating an RFC?

Philipp Tanlak 1:02

The main reason I've created this RFC was out of necessity and interest, mainly to scratch my own itch.

Derick Rethans 1:08

That is how most things make it into PHP in the end isn't it?

Philipp Tanlak 1:11

Yeah, I guess.

Derick Rethans 1:12

The RFC is titled str_contains, that tells me something that is about strings and containing things. How do we currently find a string in a string?

Philipp Tanlak 1:22

The current approach to find the string in a string is to use the strpos() function or the strstr() function. But on Reddit, I found someone also use preg_match which I find kind of interesting.

Derick Rethans 1:35

There are multiple different methods in use. What are the general problems with these approaches that people have taken?

Philipp Tanlak 1:41

I find the current approach not very intuitive, mainly because of the return values of these functions. For example, strpos() returns either the position where the string is found, or false if the string is not found, so there has to be a check with a !== operation. And the strstr() function just returns a string, so you have to convert that to a boolean to check if the string is found or not.

Derick Rethans 2:11

Because with strpos(), if you didn't use the === or !== operator, and it found the string at the first position, it would return position zero, which evaluates to false even though it has found it.
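The pitfall described in this exchange can be reproduced in a few lines. The strings are arbitrary example values.

```php
<?php

$haystack = 'PHP internals';
$needle = 'PHP';

// The needle sits at the very start, so strpos() returns int(0).
$position = strpos($haystack, $needle);

// A loose truthiness check treats position 0 as "not found".
$looseCheck = (bool) $position;

// Comparing against false with !== gives the correct answer.
$strictCheck = $position !== false;

// strstr() returns the rest of the string from the match, or false,
// so it also needs an explicit comparison to become a boolean.
$viaStrstr = strstr($haystack, $needle) !== false;

var_dump($position, $looseCheck, $strictCheck, $viaStrstr);
```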

Philipp Tanlak 2:26

Yeah.

Derick Rethans 2:27

So there are a few different problems with these things. Also, I don't think it's particularly intuitive because you need to come up with a whole construct to see whether it's part of a string.

Philipp Tanlak 2:37

Correct. I don't think it's intuitive for a beginner. So if someone is learning PHP for the first time, then he has to search through the documentation, what are the exact return values for these functions, and has to remember that so I thought, string or str_contains() might be a better fit for that to just return a true or false value.

Derick Rethans 2:58

We've mentioned str_contains() a few times now, I guess the RFC is proposing to add this function. How would this function differ from what PHP already has?

Philipp Tanlak 3:07

So this function does not differ in a lot of ways. It's basically the same implementation of the strpos() function. But instead of returning the position of the found string, it just simply returns it as a boolean value. So either true or false.

Derick Rethans 3:23

I can imagine some people will say, well, you can just do this in your own wrapper function, right? Because pretty much what it does is convert the result from strpos() to a boolean. But you must have a good reason for wanting to add an extra function here.
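The wrapper function Derick has in mind could look roughly like this. The name `contains()` is an assumption chosen for the sketch, not something from the RFC; the empty-needle check mirrors how `str_contains()` behaves in PHP 8, where an empty needle is always considered to be contained.

```php
<?php

// Hypothetical userland helper; the name `contains` is made up for this example.
function contains(string $haystack, string $needle): bool
{
    // Handle the empty needle up front so the result is the same on PHP 7 and PHP 8.
    return $needle === '' || strpos($haystack, $needle) !== false;
}

var_dump(contains('PHP internals', 'PHP'));
var_dump(contains('PHP internals', 'Python'));
var_dump(contains('PHP internals', ''));
```

Every project that wants this ends up carrying its own copy of such a helper, which is the developer-experience argument for putting it in core.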

Philipp Tanlak 3:38

The reason for this function, and maybe someone might disagree, is mainly the user experience for the developer. So this is just out of a necessity which I found, and I've been using this function quite a lot. So I thought this might be a valid addition to the PHP language. So I tried to implement it and it got some great reviews. So I thought that wasn't a very bad idea I had.

Derick Rethans 4:04

Is the RFC suggesting just a single function, str_contains()?

Philipp Tanlak 4:09

Yes, the RFC is currently adding just a single function, which is the str_contains(). When I first submitted the discussion about this RFC, there were quite a few people asking why is there no case insensitivity or multibyte versions for these, and I did not think of those at first. But in the discussion, it became clear that the multibyte version did not seem to be very necessary because the comparison is going to be byte by byte. Unlike strpos(), the position of the found string is not relevant. So it doesn't matter if there is any difference in encoding.
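The byte-by-byte argument can be checked with a small sketch, assuming PHP 8.0 or later (where `str_contains()` exists) and that both strings are UTF-8. The `\u{...}` escapes are used so the example does not depend on the encoding of the source file.

```php
<?php

// "Größe" written with escapes: ö is U+00F6, ß is U+00DF, two bytes each in UTF-8.
$haystack = "Gr\u{00F6}\u{00DF}e";
$needle = "\u{00DF}";

// strpos() reports a byte offset, which differs from the character index (3).
$byteOffset = strpos($haystack, $needle);

// For a yes/no answer the offset is irrelevant, so the byte comparison is enough.
$found = str_contains($haystack, $needle);

// The comparison is exact: an upper-case Ö (U+00D6) is a different byte sequence.
$absent = str_contains($haystack, "\u{00D6}");

var_dump($byteOffset, $found, $absent);
```

The last check also shows what a byte comparison does not give you: no case folding and no Unicode normalisation, which is where a case-insensitive variant would have to do real work.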

Derick Rethans 4:47

I remember that last year there was another RFC related to string functions: string_starts_with() and string_ends_with(). Those are two functions, and there were also variants for both case insensitivity as well as multibyte, which made eight different functions to be added to pretty much do a single thing. That RFC failed, potentially because there were so many things being added.

Philipp Tanlak 5:11

Yeah, that was also the main reason. I think the case-insensitive variant of this function was not so relevant, so I did not include it in the RFC, precisely because of the case you mentioned. So instead of polluting the global space with more functions, someone suggested to just advance PHP incrementally and add case insensitivity for this function only if it is necessary.

Derick Rethans 5:37

This is a common recurring subject. Most of the people I spoke with in the last few episodes are all adding things to PHP bit by bit instead of coming up with big RFCs, which I think is a good way of going forwards. When reading the RFC, I had a quick look at which arguments the function would accept. PHP is of course weakly typed most of the time. Does the str_contains() function handle function arguments any differently from what strpos() does?

Philipp Tanlak 6:10

So the str_contains() function uses the same internal function, which is php_memnstr(), if I recall correctly. It tries to interpret it as a string. And if it's not a string, it either throws a warning or notice, but I've just run some checks and it seems like in the next PHP version, non string values which are passed into the string functions will be interpreted as a string, and if that is not the case, it will throw an error or usually return false.

Derick Rethans 6:43

So it doesn't do any special magic, and just relies on what PHP tends to do for parsing arguments and weak and strict typing.

Philipp Tanlak 6:51

Yes, that's correct.

Derick Rethans 6:53

Most RFCs come with a patch, as does yours. How did you find getting started with writing things for PHP instead of using PHP?

Philipp Tanlak 7:02

So basically, I've looked at the PHP source code in the past, just to see how things are implemented. And I had some basic background in C. So I thought that this was not very hard for me. Most of the functions or things I needed for this patch were already there. So basically, I just copied the strpos() function, removed the code that uses the position to calculate a new string when the string is found, and returned a boolean value based on the found position.

Derick Rethans 7:35

Because it is not a very different function from strpos(), it's just pretty much a different return type. It's a lot easier to do.

Philipp Tanlak 7:44

Yeah.

Derick Rethans 7:45

When looking at feedback, what were the main criticisms of this?

Philipp Tanlak 7:48

The main criticism of this was basically just the variants of these functions. So mainly the multibyte variant or the case insensitivity. Other than that, the response was very, very nice and also very rewarding for me. So I thought I did a good job on this. And many people wanted to have this function in PHP, but either did not have the time to implement it or it was too easy. I'm not sure how that went. But I think the response from the devs and the overall PHP community was very nice.

Derick Rethans 8:23

The RFC is already in voting, so I'm a bit late to talk about it. Usually I'm on when things are still in discussion. And at the moment, it looks like it is passing because the votes are 43 to 6 with another week to go.

Philipp Tanlak 8:37

Yeah.

Derick Rethans 8:37

Do you think this will be your last RFC? Or do you have something else in mind?

Philipp Tanlak 8:41

At the time of this recording I don't have anything else in mind. But since I'm working with PHP on a daily basis, if I find something which I think is worth adding to PHP, I might create a new RFC.

Derick Rethans 8:54

That's how I started, so let's see what happens now. Thank you for taking the time to talk to me today Philipp, I hope you enjoyed this.

Philipp Tanlak 9:01

Yeah, thanks for having me Derick.

Derick Rethans 9:05

Thanks for listening to this instalment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.


Hoarding toilet paper, project health, and staying home

Jake and Michael share how their lives have changed in the two weeks since the last episode thanks to the state of the world, and brainstorm how they plan on managing project health in their upcoming SaaS, thenping.me.

Jake and Michael discuss vehicle insurance, health insurance, validation outside of HTTP requests, event sourcing, and more!

This episode is sponsored by Fathom Analytics; the simple, privacy-focussed analytics for bloggers & businesses and was streamed live.



PHP Internals News: Episode 45: Language Evolution Overview Proposal


In this episode of "PHP Internals News" I chat with Nikita Popov (Twitter, GitHub, Website) about the Language Evolution Overview Proposal RFC.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:16

Hi, I'm Derick. And this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 45. Today I'm talking with Nikita Popov yet again about a non technical RFC that he's produced titled language evolution overview. Somewhere last year, there was a big discussion about P++, an alternative idea of how to deal with improving PHP as a language while also thinking about other people who already use PHP and don't really want to change how they currently use it. Back then I didn't really have an episode about that because I'd like to keep politics out of this podcast, or definitely PHP's internals politics. I do think that we realised at that moment that something did have to happen, because there's not really a policy about when we can add things, when we can remove things, and so on. So I was quite pleased to see that you have come up with a quite wordy RFC, not talking about anything technical, but more looking forward to where we will see PHP in the near or medium future, I would say. What are your thoughts about making this RFC to start with?

Nikita Popov 1:29

As you mentioned, we had some pretty, let's say, heated discussions last year, concerning especially backwards incompatible changes. So there were a number of very, very contentious RFCs. One of them was the short open tags removal, and another one was the reclassification of undefined variable warnings, so whether those should throw or not throw. And the basic contention is that PHP is by now a pretty old language, 25 years old. And we can all admit that it's not the language with the best design. So it has evolved relatively organically, with quite a few warts and the famous inconsistencies. And now we have this problem where we would like to resolve some of these long standing issues. Many of them are genuine problems that introduce bugs in code and reduce developer productivity. But at the same time, we have a huge amount of legacy code. So there are probably many hundreds of millions of lines of PHP code. And every time we do a backwards compatibility break, that code has to be updated, or more realistically, that code does not get updated and keeps sitting on an old PHP version that, at some point, also drops out of security support. And now the question is how we can fix the problems that PHP has, while still allowing this legacy code to update its PHP version. The general idea of how to fix this is to make certain backwards compatibility breaks opt in. By default, you just get the old behaviour, but you can specify in some way, exactly how it's done doesn't really matter at this point, that you want to opt into some kind of change or improvement.

Derick Rethans 3:34

One example being the strict types that have been introduced in PHP, which you need to turn on with a declare switch.

Nikita Popov 3:42

Strict types is really a great example because it has the important characteristic that it is done per file. So you can turn on strict types in one file and not affect any other code, at least in theory. There are some edge cases, but I think mostly you can just enable strict types in your library and you don't affect any other library that the project uses. We would like to extend this concept. It should be possible for libraries to update to a new language, well, let's call it a language dialect, without forcing other libraries or the using code to update as well. Because this is what we have to do right now: before you can update your project to PHP eight, let's say, you first have to wait until all the libraries you're using update to PHP eight. And maybe there are libraries that are going to update but also say: okay, now PHP eight is actually required. And then you kind of get these complex dependencies with libraries supporting these versions and not supporting those versions, and doing updates becomes pretty hard. As I said, the idea is to make these backwards incompatible changes opt in, in some way, and there are multiple general models. So as you mentioned, P++ is the most radical approach. It's more or less a separate language but sharing the same implementation. And as the name suggests, this is inspired by C and C++. Those are usually implemented in the same compiler, and they can be interoperable in a limited way, mostly in that you can use C code inside C++ easily. Using C++ code inside C code tends to be much harder. P++ is, I think, the option we are pretty unlikely to take for a couple of reasons. It's this kind of one time huge break, which first means that we only have one chance to get it right, and given our track record, we should maybe not rely on that. It also means that the upgrade becomes especially hard because you have to do everything at once. It's not spread out over a longer time.

Derick Rethans 5:54

You say that we need to get it right in one go, but that is hard to say because you don't know what else we want to add in the future. The RFC mentions a few other cases, like, for example, forbidding dynamic object properties, which we'd have to do right away as well if we'd go with the two languages, one implementation approach, right? I mean, if we hadn't thought about it before the split was made, and nobody thought about it until afterwards, we'd still not be able to do it.

Nikita Popov 6:20

That's true. So P++ is a one-time solution. It doesn't really scale over time. I mean, there are also other concerns. And I think in the end, one of the big ones is just that we don't have the resources for it anyway. We have only maybe three full time developers on PHP, and I don't think we want to start focusing on this huge, more or less separate language. That's just going to take a couple of years. Next to having this entirely separate language, there are two other ways to approach the problem. One is editions, which is a concept used by the Rust programming language. The idea there is that next to the version, which is more or less an implementation version, you also have this edition, which is a completely orthogonal concept. Basically, we would say: okay, right now we are, for example, at edition zero. And then in edition one you opt into some kind of set of backwards incompatible changes. Then in edition two, there are more backwards incompatible changes, and so on. Each edition is essentially a superset of the previous one.
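Editor's sketch: the "each edition is a superset of the previous one" idea can be modelled in a few lines. This is only an illustration in Python, not anything from the RFC. The table `CHANGES_INTRODUCED_BY_EDITION`, the function `enabled_changes`, and the assignment of changes to edition numbers are all hypothetical.

```python
# Hypothetical model: each edition adds a set of opt-in breaking changes
# on top of everything the previous editions already enabled.
# Which change lands in which edition is invented for illustration.
CHANGES_INTRODUCED_BY_EDITION = {
    0: set(),                        # edition zero: today's behaviour
    1: {"no_dynamic_properties"},
    2: {"strict_operators"},
}


def enabled_changes(edition):
    """Return every breaking change active when a file opts into `edition`."""
    if edition not in CHANGES_INTRODUCED_BY_EDITION:
        raise ValueError(f"unknown edition: {edition}")
    active = set()
    for number in sorted(CHANGES_INTRODUCED_BY_EDITION):
        if number <= edition:
            # Editions are cumulative: later ones include earlier ones.
            active |= CHANGES_INTRODUCED_BY_EDITION[number]
    return active
```

Under this model a file on edition two gets both changes, while a file that stays on edition zero gets none, regardless of which PHP version runs it.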

Derick Rethans 7:32

Would it also mean you couldn't get new features in a new edition or is it purely about making backwards incompatible changes?

Nikita Popov 7:40

So, this is purely about backwards compatibility. If a new feature can be added without breakage, then it should always be available. The edition switch would only control the backwards incompatible parts. This is to contrast with the second approach, which is to have fine grained declare statements. As you already mentioned, we have the existing strict types directive and we could continue down the same path. So, we could add a new declare for no dynamic object properties equals one, and then for strict operators equals one, and for whatever else equals one. And then you would have this long list of possible declares, with which you could enable or disable some particular bit of language behaviour.

Derick Rethans 8:26

Then I can imagine that in another five years, that list might be 20 options long.

Nikita Popov 8:31

Right. So, the concern there is of course, one part is maintenance, because we have to support basically an exponential combination of different options. And the other is from the programmer perspective, that the mental model becomes more complicated because you have to keep in mind which exact set of declares you are using right now. I should say, though, that this model is actually used by Python, because Python has this "from __future__ import" feature. So there is basically this magic module __future__ from which you can import language features that will become the default in newer Python versions. For example, you can import the new integer division behaviour inside an older version. This is more or less the same as doing the fine grained declares, just with a different syntax and with, I think, a stronger focus on the behaviour becoming the default in a future version.
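Editor's sketch: the Python mechanism mentioned here can be inspected directly. On Python 2.2 and later you would write `from __future__ import division` at the top of a file to opt in. The snippet below assumes a Python 3 interpreter, where that behaviour is already the default, and only reads the metadata that the standard `__future__` module records for the feature.

```python
import __future__

# Each entry in the __future__ module records when a feature became
# available as an opt-in and when it became the default.
feature = __future__.division

optional_in = feature.getOptionalRelease()    # first release with the opt-in
mandatory_in = feature.getMandatoryRelease()  # release where it is the default

print("opt-in since:", optional_in[:2])
print("default since:", mandatory_in[:2])

# On Python 3 the new behaviour needs no import:
# "/" is true division, "//" is floor division.
true_division = 7 / 2
floor_division = 7 // 2
```

The two release tuples are what makes this different from an open-ended switch: every opt-in carries the version in which it stops being optional.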

Derick Rethans 9:38

So basically, you're opting into experimental functions really?

Nikita Popov 9:41

Could be either experimental functions, or it could be really functions from newer versions. In particular, Python also had parallel development of Python 2 and Python 3 for a while, in which context this probably makes more sense.

Derick Rethans 9:56

There's pretty much three options that the RFC mentions: a new language common implementation or the PHP / P++ option, the editions, and the fine grained declares. These are all still going to be based per file?

Nikita Popov 10:12

So that's the second large question. The first one is what the general model is, and the second one is where we declare it. The approach I was initially pursuing was to have this declared at the package level, so for a whole library or for a whole project.

Derick Rethans 10:32

How would you define what a package is?

Nikita Popov 10:33

We have namespaces. And there is a somewhat loose coupling between namespaces and packages. So I have an old RFC for namespace-scoped declares, where you could, for example, specify strict types for a whole namespace, which is, I think, maybe the most natural way to treat packages right now, because this is the closest thing to a package we have. Unfortunately, it does have a few issues. One of them is that this namespace to package mapping is not always there. So there are packages that have a somewhat odd nesting of namespaces. And I've also heard that some people, for example, define their models inside the Doctrine namespace, because they, you know, extend its classes, so they also put them in the namespace. Of course, you shouldn't do that. But it's a thing that could happen, because we don't really have any enforcement that the namespace really is a package. And then there are also technical concerns, because right now, namespaces are really just a compile time thing to handle name resolution, and now they kind of turn into a feature that also has some kind of runtime impact. And you have to consider things like what happens if you have multiple namespaces in the same file, and also other considerations, like what happens if the namespace is first used and you issue some namespace-scoped declares afterwards. All that can be resolved, but it makes the model somewhat more complicated.
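Editor's sketch: one way to picture namespace-scoped declares is a registry keyed by namespace prefix, with a longest-prefix lookup per class. This is a Python illustration under assumptions, not the mechanism of the RFC. The registry `NAMESPACE_DECLARES`, the function `declares_for`, and the example class names are all hypothetical.

```python
# Hypothetical registry mapping a namespace prefix to its declares.
NAMESPACE_DECLARES = {
    "Doctrine": {"strict_types": 1},
    "App": {"strict_types": 0},
}


def declares_for(class_name):
    """Find the declares of the longest registered namespace prefix."""
    parts = class_name.strip("\\").split("\\")[:-1]  # drop the class name
    while parts:
        prefix = "\\".join(parts)
        if prefix in NAMESPACE_DECLARES:
            return NAMESPACE_DECLARES[prefix]
        parts.pop()  # no match, so try the parent namespace
    return {}  # global namespace or unregistered: default behaviour
```

The weakness discussed above shows up immediately: an application class declared as `Doctrine\Models\User` would silently pick up the settings registered for the `Doctrine` prefix, because nothing enforces that a namespace belongs to exactly one package.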

Derick Rethans 11:53

And I guess you end up having to declare these namespace scope declares maybe in a separate file or something like that?

Nikita Popov 12:14

At least what I have in mind is that you would declare them in composer.json, and Composer would then take care of registering them with PHP itself. Of course, you could also do that manually if you are not using Composer, but that at least is the 95% use case.

Derick Rethans 12:31

In applications that make use of Composer, it is very likely that Composer knows about all the libraries that a specific application uses, and hence will be able to construct an array, where it can tell PHP by calling a function declaring all the different options or editions or whatever that ends up being.

Nikita Popov 12:49

So that's one of the approaches. There are also some alternatives. One is to instead introduce an actual package concept. One of the possibilities is to basically add an extra line to each file, which says package and the package name. So that really removes any and all ambiguities. But you do have to add that extra line, which serves a very limited purpose, basically only these package scope declares. It could maybe also be used for some extra features, like package private symbols.

Derick Rethans 13:23

But it would also instantly make that code base non-parsable with older PHP versions.

Nikita Popov 13:28

That's also true, right. But that's a general problem that most approaches, I think, would have. Namespace-scoped declares is one that doesn't have it, but even the per file approach would have this problem, because if you write, for example, declare edition, then you would right now on PHP seven get the warning that the edition declare is not known. The last variant that I'm discussing here is to make packages based on the file system, which is something many other languages do. So you have some kind of magic file somewhere that says: okay, this directory and all the sub directories are part of the package. In PHP, this kind of file system based approach is somewhat problematic, because our include mechanism is not really based on the file system but on a fairly general stream abstraction. You can include from the file system, you can include, if you're really crazy, from HTTP, but you can also include from Phar files, from an input stream, or from some kind of custom defined stream. These file system based packages require some additional operations to be well defined. They have to have a notion of path canonicalization, so you can determine whether a file is inside the directory even if there are things like symlinks or the file system is case insensitive. That does exist for the file system, so we have the realpath syscall, but it doesn't exist for streams right now. And a similar problem is that we need to be able to walk up from a path to the parent directories, and that's also something that doesn't exist for streams. And more generally, not all streams really have a well defined concept of a directory. For example, if you are reading a file from stdin, so the input stream, then there is no directory, and which package is that going to be in?
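Editor's sketch: the two operations named here, canonicalizing a path and walking up its parent directories, look roughly like this for an ordinary file system. It is a Python illustration only. The marker file name `package.marker` and the function `find_package_root` are hypothetical; the discussion speaks only of "some kind of magic file".

```python
import os

# Hypothetical marker file name, chosen for illustration.
MARKER = "package.marker"


def find_package_root(path):
    """Return the nearest ancestor directory containing MARKER, or None."""
    # Canonicalize first so that symlinks and ".." segments cannot hide
    # the real location of the file.
    current = os.path.dirname(os.path.realpath(path))
    while True:
        if os.path.isfile(os.path.join(current, MARKER)):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the file system root
            return None
        current = parent
```

Both steps lean on file system facilities (`os.path.realpath`, parent directories). A generic stream such as stdin or an HTTP URL offers neither, which is the gap described above. The sketch also ignores case-insensitive file systems.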

Derick Rethans 15:31

I think it would be hard to end up debugging at some point, such as why some things don't actually end up being in the package where you expect them to be, for example. And then on top of that, you also need to define: well, how do I call this file and things like that, right? I mean, a PHP script wouldn't be just a single file, for example, it would be a single file and this extra definition file. And that's a concept, of course, that we don't have in PHP at all. Everything is one file pretty much.

Nikita Popov 15:56

Which is why, at least right now, I think the immediate way forward is to use per file declares. So if we don't use the fine grained declare approach, and instead have a single edition, then it's not really a problem to put the declare edition inside every file, because this is already what we do for strict types. It's not super ergonomic, but I think it's also not a huge problem. And it does have the one very big advantage that files are and remain self contained. So you don't have to consult an external definition that may be hard to locate to figure out how to process them.

Derick Rethans 16:36

And every IDE or tool would have to implement that same logic and make sure that it's all consistent with each other as well.

Nikita Popov 16:43

I wouldn't say it's really hard, but it might be somewhat fragile, especially when it comes to convention. As I said, if we put things in composer.json, that's probably something tooling can easily deal with. But if you then encounter a project that doesn't use Composer and uses some other way to register the package declares, then you might run into problems.

Derick Rethans 17:09

Lots of things to talk about and discuss at some point. As you submitted this RFC to the mailing list some time ago now, what is sort of the feedback that you're getting on this?

Nikita Popov 17:19

So I think the general direction, at least, is pretty clear. Most of the discussion is focused on the edition concept, not the fine-grained declares or P++. I think for now, we would also go with the per file approach. Now, the main two points that remain contentious are: first, what does the support timeline look like? So basically, the concept of editions just enables different libraries to upgrade independently. That's the core premise. But at least in Rust, editions are additionally guaranteed to be supported forever. So you can leave your old code running on the old edition, and you do not have to ever update it.

Derick Rethans 18:10

How often do they make new editions? Every three years?

Nikita Popov 18:13

Yeah, it's not quite clear yet, but probably it's going to be every three years. And now for us, the question is, well, do we want to support old editions forever? Or do we want to give them a finite lifetime? Say we introduce a new edition in PHP eight, and then we support it until PHP nine. That means code can take its time to do the necessary updates, but it does have to do the updates at some point.

Derick Rethans 18:37

But you'd have five years?

Nikita Popov 18:39

It's more of the general question of if it's forever or if it's limited. So I think based on the discussion, there is a pretty strong preference to not support them forever.

Derick Rethans 18:51

But for how long then? I mean, it must be longer than what we support a normal PHP version for, right?

Nikita Popov 18:56

Yeah, I would expect it to be something like a major version cycle. The second question is related to strict types. As you said, strict types is an existing example of a mechanism that works like this. And now we're introducing a second mechanism with the same basic characteristics. Are we going to merge them or not? Would we say that, in the new edition, strict types is enabled by default, or even always enabled? If we do that, and we say that editions have a limited support life, that means that strict types is going to become the only option at some point in the future. You can imagine that this is somewhat contentious because there are quite a lot of people who consider weak types to still be the superior option.

Derick Rethans 19:49

Whenever I go speak at conferences or user groups, that's not the case. One question which keeps recurring is: why isn't this the default in PHP eight? I think there's an expectation that strict types at some point is going to be turned on by default.

Nikita Popov 20:04

Yeah, and this is where people disagree on whether this expectation is justified or not. So there are plenty of people in the discussion thread, well, by plenty I mean at least two, who strongly think that strict types should remain an option. I mean, PHP often deals with input coming from HTTP or from a database, which is usually coming in as a string. And they think that the type casts you have to do to make that work with strict types actually kind of weaken the type safety guarantees, because if you perform an explicit cast, then that cast is performed basically without any checks. So you can take a completely non-numeric string, cast it to integer, and you will get zero without any warning or whatever. While even in weak typing mode, that would still result in an error.

Derick Rethans 20:58

It's a curious thing actually when you mention databases because, of course, in databases you've defined very strict types for your data. It's just interesting that PHP's interface to most of these old SQL databases just decided to always turn it into a string.

Nikita Popov 21:14

It does actually support returning things in their native type.

Derick Rethans 21:20

With PDO, yes.

Nikita Popov 21:21

But that's behind options, and I think it's also dependent on whether you do emulation or not, and stuff like that. And you have all these different drivers that have differing support for that. But yeah, to get back to strict types, one of the options is to really keep editions and strict types separate, and also evolve the strict and the non-strict mode independently. So you could say that in the new edition, the strict typing mode becomes stricter, for example, by also extending to operators, arithmetic operators, not just to function arguments. But that of course does mean that we are saying strict types stays and exists forever as a separate track of the language.

Derick Rethans 22:06

Yeah, that's an interesting one. I'm not sure how to get to a conclusion there actually. Because there's always going to be people on each side.

Nikita Popov 22:13

Yeah.

Derick Rethans 22:13

Would you think that this language evolution overview proposal would have been decided on which way to go by the time feature freeze for PHP eight comes around?

Nikita Popov 22:23

I think it would be pretty good to have this for PHP eight, because, well, it's a new major version and the time to introduce this kind of concept. I should say, though, that we already have quite a few backwards incompatible changes in PHP eight, and at least some of them we are definitely not going to retrofit into the editions concept. So there are already certainly going to be breaking changes there.

Derick Rethans 22:52

Why wouldn't you retrofit them? I mean, if we end up deciding that PHP eight will have these editions, would they not be part of that, or would they always end up breaking anyway? Because it seems like sort of an ideal place to then do it.

Nikita Popov 23:05

Yeah, the problem is just that there are some quite extensive changes, especially when it comes to warnings versus exceptions, and it would just be a lot of effort to get this under an edition flag and to support both behaviours there. Maybe some of the existing changes could be moved into there without a huge amount of effort. But I think there are definitely going to be some hard, edition independent breaking changes.

Derick Rethans 23:37

New major PHP versions still might have some backward breaking changes independently of whether we do the editions or not, or more declares or not?

Nikita Popov 23:46

Yeah, that's one more question: what exactly is the scope of editions? What goes into the edition, what doesn't go into there? I mean, there is always a cost to adding something with this mechanism. One is just maintenance for us, and the other is, of course, that the user has to consider more different versions of the language. And I think one particularly large aspect that would likely never fall under the edition concept is changes to the standard library. So editions work well for language changes, but I don't think they really make sense for standard library changes. So everything that involves deprecations of functions with eventual removal would not be covered by that.

Derick Rethans 24:31

Do you have an example of such a change in the standard library that PHP eight might have?

Nikita Popov 24:36

What I just said was more the general point that, usually in every PHP version, we deprecate a bunch of functions and are going to remove them at some point. And these deprecations are going to apply independently of what edition you set. Actual changes to the standard library, where the behaviour of a function is changed, are something we quite rarely do and generally try to avoid, specifically because this causes relatively subtle backwards compatibility breaks. So usually we will either do changes by introducing a new flag or a new function, or by deprecating the functionality entirely. Even when it comes to language changes, there is one example I know. The discussion was: well, if we had the edition concept, and we wanted to introduce something like traits, the trait functionality in general is not backwards compatibility breaking. But the trait feature does introduce two new reserved keywords, which are trait and insteadof. So there is technically a backwards compatibility break, even though it's minor. And now you have the trade off. Do you introduce traits in the new edition and only reserve the keywords there, thus removing any backwards compatibility break? Or do you introduce it always, which means that everyone can benefit from it, even if they haven't updated the code to the new edition yet, but it does introduce the small backwards compatibility break? And then you get this trade off and the discussion of what you should be doing about that.

Derick Rethans 26:17

I think making those kinds of decisions will have to be done based on evidence. And I think in the past you've used the top thousand projects on GitHub to see whether things break or not to make a decision. For example, having the nested, or the triple, quadruple nested ternary. Anytime people use it, it's pretty much a bug in the code.

Nikita Popov 26:36

Yeah, so to give one example, in PHP 7.4, we introduced the short closure syntax with the fn keyword, and there the source code analysis showed that basically, fn is not used outside of tests, apart from one library, which is my own, and which does have quite a few dependents. And that library was indeed broken essentially completely by that change. So in that case, I think there might have been an argument that this feature should be introduced under an edition, because there is evidence of actual breakage in the wild.

Derick Rethans 27:14

This is one of us trying to get it right. We now have evidence for it.

Nikita Popov 27:18

And probably the insteadof keyword for traits, that is much less problematic.

Derick Rethans 27:24

Again, as I say, it's the data that speaks there, right? That was quite a bit to go through. I'm curious to see where those discussions end up going. Hopefully, we get to a conclusion somewhere in the next few months, ready for PHP 8.0. Who knows? Maybe we'll have another podcast episode where we introduce a new editions concept.

Nikita Popov 27:43

So this is probably my most vague RFC, with a somewhat unclear goal and a somewhat unclear discussion outcome.

Derick Rethans 27:53

Do you have anything else to add to this discussion that we've missed?

Nikita Popov 27:55

I think there is just one thing maybe worth mentioning, which Rust uses pretty extensively, which is automatic upgrades. So they have some tooling to do that, which is mostly reliable. And I think it would be pretty nice if in PHP, we had something similar. In PHP, we can't really make this reliable because the language is just way too dynamic. And we actually do have some tooling in the form of the Rector library. But we might want to think about providing something under the PHP project umbrella that is more geared towards doing updates that are as safe as possible. So you can run them without thinking but still reduce your load somewhat.

Derick Rethans 28:40

And that is something that is definitely for the future. Thanks for talking to me about the language evolution overview proposal.

Nikita Popov 28:46

Thanks for having me, Derick.

Derick Rethans 28:53

Thanks for listening to this instalment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.