225:Tatted Up

This week on the podcast, Eric, John, and Thomas talk about face tattoos (yeah you read that correctly), when do you open-source code, PHP 8.1, Livewire, and more...

Links from the show:

PHPUgly streams the recording of this podcast live. Typically every Thursday night around 9 PM PT. Come and join us, and subscribe to our Youtube Channel, Twitch, or Periscope. Also, be sure to check out our Patreon Page.

Twitter Account https://twitter.com/phpugly

Host:

Streams:

Powered by Restream

Patreon Page

PHPUgly Anthem by Harry Mack / Harry Mack Youtube Channel


Mezzio framework, functional programming, software dependency security

Listen as the podcast crew got together to go through the February 2021 issue, Dealing with Data.

Topics Covered

  • Mezzio framework from Laminas (formerly Zend Framework)
  • Functional programming and where to use it in PHP applications.
  • Using CQRS and Event Sourcing to manage input and output data flows.
  • How to approach the February PHP Puzzle
  • Diving in to just use Docker.
  • Considering the security of external packages and libraries.
  • Using poker hand rules as an example for defining business requirements.
  • Drupal 9 and the conclusion of Eric’s interview with Angie Byron.
  • Using PHP’s Stream functions to transform incoming request data.
  • Finding Big Data to work with on a project.

The post Mezzio framework, functional programming, software dependency security appeared first on php[architect].


PHP Internals News: Episode 77: fsync: Buffers All The Way Down

PHP Internals News: Episode 77: fsync: Buffers All The Way Down

In this episode of "PHP Internals News" I chat with David Gebler (GitHub) about his suggestion to add the fsync() function to PHP, as well as file and output buffers.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:13

Hi, I'm Derick. Welcome to PHP internals news, a podcast dedicated to explaining the latest developments in the PHP language. This is Episode 77. In this episode I'm talking with David Gebler about an RFC that he's written to add a new function to PHP called fsync. David, would you please introduce yourself?

David Gebler 0:35

Hi, I'm David. I've worked with PHP professionally among other languages as a developer of websites and back end services. I've been doing that for about 15 years now. I'm a new contributor to PHP core, fsync is my first RFC.

Derick Rethans 0:48

What is the reason why you want to introduce fsync into the PHP language?

David Gebler 0:52

It's an interesting question. I suppose in one sense, I've always felt that the absence of fsync and some interface to fsync is provided by most other high level languages, has always been something of an oversight in PHP. But the other reason was that it was an exercise for me in familiarizing myself with PHP's core getting to learn the source code, and it's a very small contribution, but it's one that I feel is potentially useful, and it was easy for me to do as a learning exercise.

Derick Rethans 1:16

How did you find learning about PHP's internals?

David Gebler 1:19

Quite the roller coaster. The PHP internals are very arcane I suppose I would say, it's it's something that's not particularly well documented. It's quite an interesting challenge to get into it. I think a lot of it you have to pick up from digging through the source code, looking at what's already been done, putting together the pieces, but there is a really great community on the internals list, and indeed elsewhere online, and I found a lot of people very helpful in answering questions and again giving feedback when I first opened my initial proof of concept PR

Derick Rethans 1:48

Did you manage to find room 11 on Stack Overflow chat as well?

David Gebler 1:52

I did not, no.

Derick Rethans 1:53

I'll make sure to add a link in the show notes and it's where many of the PHP core contributors hang out quite a bit.

David Gebler 2:00

Sounds good to know for the future.

Derick Rethans 2:02

I read the RFC earlier today. And it talks about fsync, but it also talks about flush, or f-flush. What is the difference between them and what does fsync actually do?

David Gebler 2:14

That's the question that will be on everyone's lips when they hear about this feature being introduced into the language, hopefully. What does fsync do and what does fflush do? To understand that we have to understand the concept of the different types of buffering, an application runs on a system. So we have the application or sometimes called the user space buffer, and we have the operating system kernel space buffer,

Derick Rethans 2:36

And we're talking about writing to file here right?

David Gebler 2:39

We're talking about writing to files with fflush and fsync yep. So fflush and fsync are both about getting data out to a file, and using it to push something out of the buffers. But there are two different types of buffers; we have the application buffers; we have the operating system buffers. What fflush does is it instructs PHP to flush its own internal buffers of data that's waiting to be written into a file out to the operating system. What it doesn't do is give us any guarantee that the operating system will actually write that data to disk, so the operating system has its own buffer as well. Computers like to be efficient, operating systems like to be efficient, they will save up disk writes and do them in the order they feel like at the time they feel like. What fsync does is it also instructs the operating system to flush its buffers out to disk, thus giving us some kind of better assurance that our data has actually reached disk by the point that function returns.

Derick Rethans 3:29

I really only know about Linux here, but I know on Linux that there's a journal in the file system or, in most of the file systems that it uses. Would have seen make sure the data ends up in a journal, are also committed as how the file system does it itself?

David Gebler 3:45

The manner in which fsync synchronizes is indeed dependent on the particular POSIX type operating system, and file system that it's running on. I think you're right in respect of modern Linux and ext4, that would ensure the journal was also updated and that the data was persisted to disk. Older versions may behave a little bit differently. And one thing to note with fsync is that you can't take it as a cast iron guarantee that the data has either basically been persisted to disk, or that the corresponding file system entries have been updated. It is the best assurance you can get that those things have happened, and by the point that function returns, again, in PHP implementation, you have a solid an assurance as you're going to get. Because that manner of synchronization is dependent on the underlying system. It can vary. Yeah Linux ext4 is probably your best bet with fsync, but one interesting thing to know if it does go into PHP and people start using it, is that when you call fsync on a file, if you want to ensure that the file system entries are properly updated as well, you should also call fsync or a handle to that file's containing directory. And of course on a Linux or POSIX type system you can do that you can you can fopen a directory, and you can then call a sync on that handle.

Derick Rethans 5:02

You mentioned that you can call fsync on a file handle. How's it different than calling just fsync without an argument? Is fsync without an argument just telling the operating system to flush its current buffers, and fsync on a file handle specifically to flush that specific file, and its metadata?

David Gebler 5:22

So in the PHP implementation of fsync, you must supply it with a stream resource that is plain File Stream. You can obviously at the PHP level give it some other kind of resource but it will return a warning if it's not able to convert that into a plain File Stream. Under the hood, when you're talking about C, underlying operating system level, fsync operates on a file descriptor so it doesn't even operate on a stream handle; it operates on the actual underlying number that represents the file at the operating system level.

Derick Rethans 5:51

So that is different from the Unix shell command called fsync.

David Gebler 5:55

It is different. Yep.

Derick Rethans 5:57

The RFC also talks about another function called fdatasync. How's that different from fsync itself?

David Gebler 6:03

I think that was a really good question because it's got two answers. It's got the theoretical answer and the practical answer. The practical answer is that these days, it most likely makes no difference at all. The theoretical answer is the fdatasync only pushed out the actual file data to a disk write, but it wouldn't necessarily update metadata about the file, such as the time it was last modified the data that might be stored in the file system about the file. The reality now is that most operating systems, we'll update both of those things at once. The reason they were separated out in the POSIX specification is because of course updating the metadata is another write, so a call to fsync could require two rights were call to fdatasync would only require one. I'm not aware of any operating system and file system now that I'm going to use that actually still treats them differently

Derick Rethans 6:57

In PHP, we try to make sure that functions that are implemented work exactly the same on operating systems, and you've already explained that fsync depends a little bit on how the file system handles the specific requests. Would fsync also work in a similar way on on Windows or OSX, or is it specifically meant for just Linux?

David Gebler 7:20

It's not specifically meant for just Linux. fsync itself is part of the POSIX specification. Strictly speaking, fsync as an operating system level API does not exist on Windows. Windows does have a similar API mechanism called flush file buffers, and in the RFC, and in the implementation attached to that RFC, on Windows, that's what fsync does, it's a wrapped call to flush file buffers. It has, in practice the same effect. OSX is a bit of a trickier one. fsync does exist on OSX I know. I'm not a user of Apple products myself but I can tell you what I know about OSX. fsync on OSX, it will sort of attempt to flush your file buffers out, but OSX itself will not guarantee that the disk buffers are cached. So we have another layer of buffering there. You have the application space buffering, we have the kernel space buffering, and we have the hardware disk buffering itself. Physical drives will sometimes lie about having written data to disk. I mean USB drives are a notorious example of that. A USB drive will tell your operating system it's finished persisting data when it hasn't, and you can even see that on the little flashing LED on the drive, and if you pull it out your data will be corrupted. OSX is not so reliable. The interface exists, but whether it's actually worked or not is open to question because it may not have flushed the disk cache.

Derick Rethans 8:38

Are there any backward compatibility issues with this RFC or the implementation of this?

David Gebler 8:43

I'm pleased say there are no backwards compatibility issues at all. It's straightforwardly a new function that operates on plain file streams. It triggers a warning if you give it some kind of resource that can't be converted to a plain file descriptor - no consequence to not using it. You just get the same behaviour you've had in previous PHP versions. It's just a new optional function.

Derick Rethans 9:06

What has the feedback to introducing fsync and fdatasync been so far?

David Gebler 9:11

When I originally proposed the RFC, I had a bit of feedback on the internals list and around some of the other aspects of the PHP community that I reached out to in various places. Some people didn't see a need for it, which is fair enough, but my answer to that would be the when we look at the history of PHP as a web first language, you can see why people might not have had much of a use case for something like fsync. I mean it only fits particular situations, which is where you want some kind of transaction or guarantee. Quite often what people were doing with PHP were web applications where there was a degree of volatility in file rights, that wasn't important to them; they didn't particularly care about it. What I would say now is that when we look at PHP eight, and the evolution of the PHP ecosystem. You're seeing that it is, it is such a versatile and performance general purpose programming language these days, that people are using it for all manner of tasks. Using it in micro service architectures for back end services, they're using it even for things like data science and machine learning, emerging industries like that, so it's got so many more applications now. Broadly the feedback I had was, for the most part, probably wouldn't take much more than a pull request for it to be accepted. I think it's a very non controversial RFC. It's a small feature to add in the form of a new function.

Derick Rethans 10:33

And that is very different than Introducing enumerations or Fibers, of course.

David Gebler 10:38

Both of which I'm looking forward to.

Derick Rethans 10:40

We spoke a little bit about file system buffers, once you do a write it up in the application buffers, with flush, you can flush that to the operating system buffers and fsync to the file system. But in PHP development, there's also other buffers, that if you output something to the screen on the command line that ends up directly on the terminal. When you do this when PHP runs in a web server environment it doesn't do that because there's more buffers in between right. There's PHP's own buffers, there's buffers you can configure, there is then web server buffers, and networking buffers. So it's buffers all the way down pretty much, isn't it? How does the buffering in PHP's output work, and what kind of things can you do with that?

David Gebler 11:21

When we look at buffering in PHP, this is very easy to get confused, because he says buffers all the way down. On one hand we may be talking about buffers at the file system level and application space and kernel space, and on the other side we're gonna be talking about these kinds of buffers that you've just mentioned. Again, this is something that PHP does primarily for performance and perhaps for a few other purposes in how it manages the application that is running. Buffering is a way of storing output, somewhere before it gets sent on to the web server and ultimately from the web server to a user's browser. As you say a web server has its own buffering as well and PHP provides a couple of functions by which you can also attempt to force output to the browser. So again we have much as we have fflush for files. We have flush for regular output that you're trying to send to a web server. And that function will flush the internal output buffers of PHP, and it will then try and flush your web server buffers. There's an interesting parallel here because much like with fsync, flush versus fsync, you can't necessarily guarantee with fflush what the operating system will do. Your data wanted received it into its own buffers. With PHP you can't necessarily get a guarantee that a web server will flush its own buffers when you flush your output there. Perhaps we need to invent some kind of fsync for the web server as well. Output buffering is something that you can configure in PHP and you can stack and nest output buffers as well, so that means from an application developer point of view, you are using some other components, some other bit of code in your PHP system that produces its own output. When I say output, I mean via the normal mechanisms you would write something to a browser in PHP, so things like echo at the simplest level. What you can do is you can use PHP's output buffer functions, which are all the ones that start with ob then an underscore, to capture that output and control what you do with it, instead of it being output to the browser; you can capture it into a string, you can manipulate it, you can discard it, you can throw it up the chain to be output yourself. That's what we would primarily use those buffers for in PHP, but output buffering is also something you can configure in the php.ini file. You can turn it off, you can set the buffer size. Do you have a little bit of fine tuning there that you can do.

Derick Rethans 13:39

Something of that just popped in my mind is that when you call the new fsync on a file resource handler, file resource in PHP are implemented with an underlying interface because streams. Is there another fall writing buffer in streams itself as well?

David Gebler 13:58

There is a buffer in the actual C library level when it comes to streams. That's going into the detail a little bit of what I was talking about earlier where you have user space buffers and kernel space buffers. The reason those things exist is to do with the way an operating system manages a computer and keeps everything safe. User space isn't able to access kernel space. It's about range of memory addresses in the computer; we're getting quite low level now in terms of how all this works. In the underlying implementation in PHP source code, we're using the C File Stream functions to write the streams, and that means that the actual data gets copied into a buffer in userspace. What fsync does is it instructs the kernel to make a copy of that data in its own memory space, and then push that out to disk. It's getting quite low low level when we get into those details, it has to do with how file access is managed in PHP. There is an even lower level of access, which is more suitable for very high throughput intensive IO operations, which isn't available in PHP at all. I'm not so familiar with how this works on Linux because the file systems are different, but I can tell you in Windows, if you if you want really intensive throughput, you don't want to be calling fsync or equivalent flush file buffers, you know, hundreds of 1000s of times per second. You want to use what we call unbuffered output, where you write directly at the block level to disk. It wouldn't be suitable to try and do something like that in PHP, because it's to higher level language. It's a very very fine level of control that you need to be able to do that kind of thing. But with that level of control you also have a high level of consequence if something goes wrong.

Derick Rethans 15:39

I think there's actually an extension in PECL, or there used to be one at least, I'm not sure but it's still maintained for newer PHP versions, called dio or d.i.o., standing for direct IO, which I think implements some of these features, but I don't quite remember but I thought that was file system related, or just only network related direct IO.

David Gebler 16:00

I think direct IO extension did have lower level file system access off the top of my head. It's been a long time since I looked at it, and I think it actually had the option to open a file in synchronized mode, which means every write is essentially implicitly calling fsync. What it didn't have was the ability to open file writes in unbuffered mode, which is probably because that's lower level, still I mean that requires you to literally know the block size of the file system you're writing to and to write in that, in that size,

Derick Rethans 16:30

Doing that from PHP goes a little bit too far, I suppose.

David Gebler 16:34

It does go too far, I think.

Derick Rethans 16:36

Then again, if you can open a file you can open a file device as long as you have permission. So, I guess, nothing stops you from at least on Linux opening /dev/sda whatever number it is, as long as you're running PHP as root and writes to it, but I don't think this is a wise thing to do.

David Gebler 16:52

Definitely something I'd urge people to try with caution. I mean, obviously on Linux you you do have this kind of extraordinary power from the combination of the user you're running a process as, coupled with the fact that Linux treats everything as a file, and then actually has some interesting implications for fsync, you can try compiling the branch that I've submitted in the PR for the RFC; compile it on Linux, call fsync on some different handles to things which aren't actually files, but which to the underlying Linux operating system look like files. Hopefully the implementation I've provided is robust enough that it should just return false when you try and fsync things that are not actually files.

Derick Rethans 17:28

What kind of things are you thinking of here, things like directories?

David Gebler 17:32

Directories are fine for fsync and you should actually be able to get a successful fsync on directories because they are literally part of the file system. It's fine on POSIX type systems to do that. I'm not actually sure whether you can do that on Windows or not, but I don't think it provides any particular benefit to fsync a directory on Windows because the underlying flash file buffers API works on the file level only, but on Linux you could try opening file handles to something like a Unix socket and try and fsync that. You should just get false in PHP land from that function because it knows that it can't convert that file handle into an actual file stream, and thus can't get a regular fsync on it.

Derick Rethans 18:12

Have you created a test case for that situation?

David Gebler 18:15

I have created some test cases for opening streams to things that are not files from PHP. I can't whether I've covered that particular one or not, but I have got a couple in there for things which are not files at the PHP level.

Derick Rethans 18:29

That's good to hear. When do you think he will be putting this up for a vote?

David Gebler 18:34

I'm planning to put this up for a vote later this week actually. The RFC has been open for a little while, I think it's been about three weeks since I announced it on internals, and obviously there hasn't been a huge amount by way of feedback or discussion on it lately. That doesn't particularly surprise me, it is not a particularly controversial thing to add. I think a lot of people probably don't have much feeling about it one way or the other. But then, I'm hopeful, people who vote will see that as a reason to include it.

Derick Rethans 19:00

Thank you David for explaining the fsync RFC and related topics to me today.

David Gebler 19:05

Well, thanks for having me on. It's been great talking to you. And of course to anyone listening who's interested in the RFC, do check it out on the PHP wiki. Do have a look at the discussion on internals. And if you're a voting member, don't forget to vote when I open it up to vote later this week.

Derick Rethans 19:21

Thank you for listening to this instalment of PHP internals news, a podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool, you can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next time.

Show Notes


Interview with Michael Akopov

Join us for an interview with Michael Akopov, another returning guest to the podcast.

Topics Covered

  • Working with Laravel, Vapor, Forge, and Livewire.
  • His new book, Beyond Laravel which offers an overview of the ecosystem around the framework.
  • His six rules to keep in mind when building a business, product, or service.
  • Using the community and services around Laravel for your next project or product idea.
  • Remote working when everyone else to all remote.
  • Ruby versus PHP for web applications.

The post Interview with Michael Akopov appeared first on php[architect].


Rolling up, building static sites, and user wants

Jake and Michael discuss rolling up assets in legacy projects, building static sites with Vite and Statamic, and realising the hopes and dreams of your applications' users into actual development work.

This episode is sponsored by Workvivo - the employee communication platform to excite, engage and connect your entire company - and was streamed live.

Show links

224:The UUID Life

I've got to be honest with you, this is a ROUGH CUT of a show. Had issues with recording, the issue is I forgot to start the recording, so I had to grab the audio from the stream. So there are barking dogs and all sorts of issues in the show.

This week on the podcast, Eric, John, and Thomas talk about STILL MORE Crypto Talk, NERDTree in PHPStorm, enums in PHP, and much more

Links from the show:

PHPUgly streams the recording of this podcast live. Typically every Thursday night around 9 PM PT. Come and join us, and subscribe to our Youtube Channel, Twitch, or Periscope. Also, be sure to check out our Patreon Page.

Twitter Account https://twitter.com/phpugly

Host:

Streams:

Powered by Restream

Patreon Page

PHPUgly Anthem by Harry Mack / Harry Mack Youtube Channel



PHP Internals News: Episode 76: Deprecate null, and Array Unpacking

PHP Internals News: Episode 76: Deprecate null, and Array Unpacking

In this episode of "PHP Internals News" I chat with Nikita Popov (Twitter, GitHub, Website) about two RFCs: Deprecate passing null to non-nullable arguments of internal functions, and Array Unpacking with String Keys.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:14

Hi I'm Derick. Welcome to PHP internals news, a podcast dedicated to explain the latest developments in the PHP language. This is Episode 76. In this episode, I'm talking with Nikita Popov about a few more RFCs that he has been working on over the past few months. Nikita, would you please introduce yourself.

Nikita Popov 0:34

Hi, I'm Nikita. I work on PHP core development on behalf of JetBrains.

Derick Rethans 0:39

In the last few PHP releases PHP is handling of types with regards to internal functions and user land functions, has been getting closer and closer, especially with types now. But there's still one case where type mismatches behave differently between internal and user land functions. What is this outstanding difference?

Nikita Popov 0:59

Since PHP 8.0 on the remaining difference is the handling of now. So PHP 7.0 introduced scalar types for user functions. But scalar types already existed for internal functions at that time. Unfortunately, or maybe like pragmatically, we ended up with slightly different behaviour in both cases. The difference is that user functions, don't accept null, unless you explicitly allow it using nullable type or using a null default value. So this is the case for all user types, regardless of where or how they occur as parameter types, return values, property types, and independent if it's an array type or integer type. For internal functions, there is this one exception where if you have a scalar type like Boolean, integer, float, or a string, and you're not using strict types, then these arguments also accept null values silently right now. So if you have a string argument and you pass null to it, then it will simply be converted into an empty string, or for integers into zero value. At least I assume that the reason why we're here is that the internal function behaviour existed for a long time, and the use of that behaviour was chosen to be consistent with the general behaviour of other types at the time. If you have an array type, it also doesn't accept now and just convert it to an empty array or something silly like that. So now we are left with this inconsistency.

Derick Rethans 2:31

Is it also not possible for extensions to check whether null was passed, and then do a different behaviour like picking a default value?

Nikita Popov 2:40

That's right, but that's a different case. The one I'm talking about is where you have a type like string, while the one you have in mind is where you effectively have a type like string or null.

Derick Rethans 2:51

Okay.

Nikita Popov 2:52

In that case, of course, accepting null is perfectly fine.

Derick Rethans 2:56

Even though it might actually end up being different defaults.

Nikita Popov 3:01

Yeah. Nowadays we would prefer to instead, actually specify a default value. Instead of using null, but using mull as a default and then assigning something else is also fine.

Derick Rethans 3:13

What are you proposing to change here, or what are you trying to propose to change that into?

Nikita Popov 3:18

To make the behaviour of user land and internal functions match, which means that internal functions will no longer accept null for scalar arguments. For now it's just a deprecation in PHP 8.1, and then of course later on that's going to become a type error.

Derick Rethans 3:35

Have you checked, how many open source projects are going to have an issue with this?

Nikita Popov 3:40

No, I haven't. Because it's not really possible to determine this using static analysis, or at least not robustly because usually null will be a runtime value. No one does this like intentionally calling strlen with a null argument, so it's like hard to detect this just through code analysis. I do think that this is actually a fairly high impact change. I remember that when PHP 7.2, I think, introduced to a warning for passing null to count(). That actually affected quite a bit of code, including things like Laravel for example. I do expect that similar things could happen here again so against have like strlen of null is pretty similar to count of null, but yeah that's why it's deprecation for now. So, it should be easy to at least see all the cases where it occurs and find out what should be fixed.

Derick Rethans 4:35

What is the time frame of actually making this a type error?

Nikita Popov 4:38

Unless it turns out that this has a larger impact than expected. Just going to be the next major version as usual so PHP 9.

Derick Rethans 4:45

Which we expect to be about five years from now.

Nikita Popov 4:49

Something like that, at least if we follow the usual cycle.

Derick Rethans 4:52

Yes. Are there any other concerns for this one?

Nikita Popov 4:55

No, not really.

Derick Rethans 4:57

Maybe people don't realize it.

Nikita Popov 4:58

Yeah, possibly. You can't predict these things, I mean like, this is going to have like way more practical impact for legacy code than the damn short tags. But for short tags, we get 200 mails and here we get not a lot.

Derick Rethans 5:14

I think this low impact WordPress a lot.

Nikita Popov 5:17

Possibly but at least the thing they've been complaining about is that something throws error without deprecation, and now they're getting the deprecation so everyone should be happy.

Derick Rethans 5:28

Which is to be fair I think is a valid concern.

Nikita Popov 5:30

Yes, it is. I've actually been thinking if we should like backport some deprecations to PHP 7.4 under an INI flag. Not like my favourite thing to work on, but people did complain?

Derick Rethans 5:47

Which ones would you put in there?

Nikita Popov 5:48

I think generally some cases where things went from no diagnostics to error. I think something that's mentioned this vprintf and round, and possibly the changes to comparison semantics. I did have a patch that like throws a deprecation warning, when that changes and that sort of something that could be included.

Derick Rethans 6:12

I would say that if we were in January 2020 here, when these things popped up, then probably would have made sense to add these warnings and deprecations behind the flag for PHP seven four, but because we've now have done 15 releases of it, I'm not sure how useful this is now to do.

Nikita Popov 6:30

I guess people are going to be upgrading for a long time still. I don't know I actually not sure about how, like distros, for example Ubuntu LTS update PHP seven four. If they actually follow the patch releases, because if they don't, then this is just going to be useless.

Derick Rethans 6:48

Oh there's that. Yeah.

Derick Rethans 6:50

There is one more RFC that I would like to talk to you about, which is the array unpacking with string keys RFC. That's quite a mouthful. What does the background story here?

Nikita Popov 7:00

The background is that we have unpacking in calls. If you have the arguments for the call in an array, then you write the three dots, and the array is unpacked into actual arguments.

Derick Rethans 7:14

I'd love to call it the splat operator.

Nikita Popov 7:16

Yes, it is also lovingly called the splat operator. And I think it has a couple more names. So then, PHP 7.4 added the same support in arrays, in which case it means that you effectively merge, one array to the other one. Both call unpacking and array unpacking, at the time, we're limited to only integer keys, because in that case, are the semantics are fairly clear. We just ignore the keys, and we treat the values as a list. Now with PHP 8.0 for calls, we also support string keys and the meaning there is that the string keys are treated as parameter names. That's how you can like do a dynamic named parameter call. Actually, this probably was one of the larger backwards compatibility breaks in PHP eight. Not for unpacking but for call_user_func_arg, where people expected the keys to be thrown away, and suddenly they had a meaning, but that's just a side note.

Derick Rethans 8:21

It broke some of my code.

Nikita Popov 8:23

Now what this RFC is about is to do same change for array unpacking. So allow you to also use string keys. This is where originally, there was a bit of disagreement about semantics, because there are multiple ways in which you can merge arrays in PHP, because PHP has this weird hybrid structure where arrays are a mix between dictionaries and lists, and you're never quite sure how you should interpret them.

Derick Rethans 8:54

It's a difference between array_merge and plus, but which way around, I can ever remember either.

Nikita Popov 9:00

What array_merge does is for integer keys, it ignores the keys and just appends the elements and for string keys, it overwrites the string keys. So if you have the same string key one time earlier and again later than it takes the later one. Plus always preserves keys, before integer keys. It doesn't just ignore them, but also uses overriding semantics. The same is the other way round. If you have something in the first array, a key in the first array and the key in the second array, then we take the one from the first array, which I personally find fairly confusing and unintuitive, so for example the common use case for using plus is having an array with some defaults, in which case you have to, like, add or plus the default as the second operand, otherwise you're going to overwrite keys that are set with the defaults which you don't want. I don't know why PHP chose this order, probably there is some kind of idea behind it.

Derick Rethans 10:01

It's behaviour that's been there for 20 plus years that might just have organically grown into what it is.

Nikita Popov 10:07

I would hope that 20 years ago at least someone thought about this. But okay, it is what it is. So ultimately choice for the unpacking with string keys is between using the array_merge behaviour, the behaviour of the plus operator, and the third option is to just always ignore the keys and always just append the values. And some people actually argue that we should do the last one, because we already have array_merge and plus for the other behaviours. So this one should implement the one behaviour that we don't support yet.

Derick Rethans 10:40

But that would mean throwing away keys.

Nikita Popov 10:43

Yes. Just like we already throw away integer keys, so it's like not completely out there. So yeah, that is not the popular option, I mean if you want to throw away keys can just call array_values and go that way. So in the end, the semantics it uses is array_merge

Derick Rethans 10:58

The array_merge semantics are..

Nikita Popov 11:01

append, like ignore integer keys just append, and for string keys, use the last occurrence of the key.

Derick Rethans 11:07

So it overwrites.

Nikita Popov 11:08

It overwrites, exactly. Which is actually also the semantics you get if you just write out an array literal where the same key occurs multiple times. Unpacking is like kind of a programmatic way to write a function call or an array literal, so it makes sense that the semantics are consistent.

Derick Rethans 11:26

I think I agree with that actually, yes. Are there any changes that could break existing code here?

Nikita Popov 11:32

Not really because right now we're throwing an exception if you have string keys in array unpacking. So it could only break if you're like explicitly catching that exception and doing something with it, which is not something where we provide any guarantees I think. So generally I think that, removing an exception doesn't count as a backwards compatibility break.

Derick Rethans 11:55

I think that's right. Do you have anything else to add here?

Nikita Popov 11:59

No, I think that's a simple proposal.

Derick Rethans 12:02

Thank you, Nikita for taking the time to explain these several RFCs to me today.

Nikita Popov 12:07

Thanks for having me Derick.

Derick Rethans 12:11

Thank you for listening to this instalment of PHP internals news, a podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next time.


223:I am Human

This week on the podcast, Eric, John, and Thomas talk about more Crypto, Array Unpacking, Tinker in Vim, and more...

Links from the show:

Twitter Account https://twitter.com/phpugly

Host:

Streams:

Powered by Restream

Patreon Page

PHPUgly Anthem by Harry Mack / Harry Mack Youtube Channel