Reminiscing, testing validation, and (not) paddlin’ your database

In this, our 100th episode, Jake and Michael reminisce over the past five years of the show, discuss having (and then not having) kids, testing form validation, testing around third-party boundaries, and solving scaling issues on thenping.me.

This episode is sponsored by Makeable.dk and Workvivo.



Interview with Ken Marks

Eric van Johnson and John Congdon talk to Ken Marks about his article in the July issue, Mentoring and Teaching PHP, and his new book, PHP Web Development with MySQL.

Topics Covered

  • How he got started writing and why he wrote a book.
  • How he teaches students to build web applications with PHP and MySQL.
  • Getting started in teaching PHP.
  • Becoming part of his local web development community.
  • Staying motivated as a student or intern.

The post Interview with Ken Marks appeared first on php[architect].


247: Just Google ….. oh no don’t, I was wrong

This week on the podcast, Eric, John, and Thomas talk about Livewire, Vim, Generators, Constructor property promotion, and more...

Links from the show:

This episode of PHPUgly was sponsored by:

PHPUgly streams the recording of this podcast live. Typically every Thursday night around 9 PM PT. Come and join us, and subscribe to our Youtube Channel, Twitch, or Periscope. Also, be sure to check out our Patreon Page.

Twitter Account https://twitter.com/phpugly

Host:

Streams:

Powered by Restream

Patreon Page

PHPUgly Anthem by Harry Mack / Harry Mack Youtube Channel

Thanks to all of our Patreon Sponsors:

Honeybadger ** This week's Sponsor **
ButteryCrumpet
Shawn
David Q
Ken F
Tony L
Frank W
Jeff K
Shelby C
S Ferguson
Boštjan O
Matt L
Dmitri G
Knut E B
Marcus
MikePageDev
Rodrigo C
Billy
Darryl H
Blaž O
Mike W
Holly S
Peter A
Ben R
Luciano N
Elgimbo
Wayne
Kevin Y
Alex B
Clayton S
Kenrick B
Kalen J
R. C. S.
ahinkle
dreamup
Enno R
Sevi
Maciej P
Jeroen F
Ronny M N
Chris C
Tristan I


Elasticsearch, Teaching PHP, Design Patterns, People, Joe Watkins, and more

Listen to Eric, John, and Oscar discuss the articles in the July 2021 issue, Deep Dive into Search.

Topics Covered

  • Using Elasticsearch in an application.
  • Archery (for some reason).
  • Password complexity and entropy.
  • Why you should use a password manager.
  • Eric’s interview with Joe Watkins about Bus Factors.
  • Teaching and mentoring new developers.
  • Rocky Linux, a CentOS alternative.
  • Soylent Green, Stakeholders, and Requirements.
  • When to use the Decorator Pattern.
  • Returning to a new normal.

The post Elasticsearch, Teaching PHP, Design Patterns, People, Joe Watkins, and more appeared first on php[architect].



246: Spilled Beans

Thanks to all of our Patreon Sponsors:

Honeybadger ** This week's Sponsor **
ButteryCrumpet
Shawn
David Q
Ken F
Tony L
Frank W
Jeff K
Shelby C
S Ferguson
Boštjan O
Matt L
Dmitri G
Knut E B
Marcus
MikePageDev
Rodrigo C
Billy
Darryl H
Blaž O
Mike W
Holly S
Peter A
Ben R
Luciano N
Elgimbo
Wayne
Kevin Y
Alex B
Clayton S
Kenrick B
Kalen J
R. C. S.
ahinkle
dreamup
Enno R
Sevi
Maciej P
Jeroen F
Ronny M N
Chris C ** New Patreon Member!! **
Tristan I ** New Patreon Member!! **


PHP Internals News: Episode 92: First Class Callable Syntax

In this episode of "PHP Internals News" I chat with Nikita Popov (Twitter, GitHub, Website) about the "First Class Callable Syntax" RFC.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:14

Hi, I'm Derick. Welcome to PHP internals news, the podcast dedicated to explaining the latest developments in the PHP language. This is Episode 92. Today I'm talking with Nikita Popov about a first class callable syntax RFC that he's proposing together with Joe Watkins. Nikita, would you please introduce yourself?

Nikita Popov 0:36

Hi, Derick. I'm Nikita and I am still working at JetBrains. And still working on PHP core development.

Derick Rethans 0:43

Just like about half an hour ago when we recorded an earlier episode.

Nikita Popov 0:47

Exactly.

Derick Rethans 0:48

This RFC has no relation to read only properties. What is the first class callable syntax RFC about?

Nikita Popov 0:55

The context here is that PHP has the callable syntax based on literals, which is that if you just use a plain string, it's interpreted as a function name, and an array where the first element is an object, and the second one is a method name, that's methods. Or the first element is the class name, and the second one is method name, that's a static method.

Derick Rethans 1:17

I would consider this concept a bit of a hack, especially the one with the arrays, and I reckon you feel similarly, hence this RFC?

Nikita Popov 1:27

Yes, I do. So the current callable syntax has a couple of issues. I think the core issue is that it's not really analysable. So if you see this kind of array with two strings inside it, it could just be an array with two strings; you don't know if that's supposed to actually be a static method reference. If you look at the context of where it is used, you might be able to figure out that actually, this is a callable. And in your IDE, if you rename this method, then this array element should also be renamed. But there's a lot of complex reasoning that the static analyser has to perform. That's one side of the issue. The second one is that callables are not scope independent. For example, if you have a private method, then at the point where you create your callable, as an array, it might be callable there, but then you pass it to some other function. And that's in a different scope. And suddenly that method is not callable there. So this is a general issue with both this callable syntax based on arrays, and also the callable type. It's callable at exactly this point, not necessarily callable at a later point. This is what the new syntax essentially addresses. So it provides a syntax that clearly indicates that yes, this really is a callable, and it performs the callability check at the point where it's created, and also binds the scope at that time. So if you pass it to a different function in a different scope, it still remains callable.
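
The scope problem Nikita describes can be illustrated with a short sketch. This is not taken from the RFC; the Greeter class and its method names are hypothetical, and Closure::fromCallable stands in for the new syntax.

```php
<?php
declare(strict_types=1);

final class Greeter
{
    public function legacy(): array
    {
        // Legacy array callable: just an object and a method-name string.
        return [$this, 'secret'];
    }

    public function bound(): Closure
    {
        // Callability is checked here, and the class scope is bound here.
        return Closure::fromCallable([$this, 'secret']);
    }

    private function secret(): string
    {
        return 'hello';
    }
}

$greeter = new Greeter();

// Outside the class, the array no longer counts as callable...
var_dump(is_callable($greeter->legacy())); // bool(false)

// ...while the closure created inside the class still works.
echo ($greeter->bound())(), PHP_EOL; // hello
```

The array form only looks callable from inside Greeter, whereas the closure carries its scope with it.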

Derick Rethans 3:01

And it's guaranteed to always be callable.

Nikita Popov 3:03

Yeah, exactly.

Derick Rethans 3:04

What does the syntax look like?

Nikita Popov 3:06

The syntax is the funny bit. As a bit of context. This proposal was created as an alternative or as a subset of the partial function application RFC.

Derick Rethans 3:17

That is just as hard to pronounce as first class callable syntax RFC.

Nikita Popov 3:21

Yes, that's why we say PFA. The PFA RFC has a more general feature. It also allows you to create a reference to a callable as a side effect. But more generally, it allows you to also bind some of the arguments to a fixed value. And it has finer control: for example, you can create a callable that has three required parameters by passing three question mark arguments, while the new syntax only allows you to use the signature of the original function. But the syntax between both of those is compatible. So the new RFC is a subset of PFA. And that's why it uses the syntax where you do a normal function call, but then pass three dots or an ellipsis as arguments.

Derick Rethans 4:08

Instead of passing the function's or method's normal arguments, you use the three dots.
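
For readers who have not seen it written down, a minimal sketch of the three-dot form follows. It assumes PHP 8.1 or later, where this syntax is available; the Text class is a hypothetical example, not something from the RFC.

```php
<?php
declare(strict_types=1);

final class Text
{
    public static function shout(string $s): string
    {
        return strtoupper($s) . '!';
    }

    public function wrap(string $s): string
    {
        return "[$s]";
    }
}

// The call looks normal, but "..." replaces the argument list.
$length = strlen(...);              // plain function
$shout  = Text::shout(...);         // static method
$wrap   = (new Text())->wrap(...);  // instance method

// The real arguments are supplied when the closure is called.
echo $length('abc'), PHP_EOL; // 3
echo $shout('hi'), PHP_EOL;   // HI!
echo $wrap('x'), PHP_EOL;     // [x]
```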

Nikita Popov 4:14

I think the way to think about the syntax is that this is similar to a variadic argument, or to the argument unpacking syntax, just that the arguments haven't yet been provided; they will be provided during the actual call. But I think the syntax was definitely the most contentious bit in the discussion of the RFC. I think this is mainly related to the fact that if you see this code snippet, it looks a bit like example code where the arguments haven't been filled in, while now this is actual syntax.

Derick Rethans 4:44

I'm sure there's quite a few tutorials out there explaining how PHP works by using dot dot dot. That is not something you can avoid.

Nikita Popov 4:54

Well, we can avoid it, but it's a fairly tricky question. I mean, the reason for this dot dot dot syntax is, on one hand, the compatibility with partial functions. I mean, the PFA RFC has recently been declined. But in the future, we could extend the current syntax to full partial functions, and we would not end up with two different ways. So that's one benefit of the syntax. But the other part is that PHP has different symbol tables for different kinds of symbols. People often ask, why can't you just write strlen as a plain name, not inside a string, and have that be treated as a reference to this function? And the answer to that is that we can't do that because you can also have a constant that's called strlen. Normally, that would be a reference to a constant, and the same actually applies to all other callable types as well. So if you have something like methods, like object arrow method name, that would right now be interpreted as a property access. And for static methods, it would be interpreted as a class constant access. So we have this ambiguity here. Even if we add an additional symbol to this, for example, for classes we have the syntax class name and then scope operator class, which gives you the class name. We could do something like strlen, scope operator function, or fn, or whatever, and have that return the callable. That would work, but it also has some ambiguities. For example, if you have something like object, arrow, method, and then scope operator fn, you have this ambiguity. Is this referencing the method of that name? Or is it referencing a callable stored inside the property of that name? This is fundamentally ambiguous. The way we would resolve it is we would just say that this syntax is only usable with real symbols, so it will always refer to a method, and you couldn't use the syntax to convert the callable stored in a property into a proper callable.
I'm actually not sure how I should distinguish these two concepts, because we have the existing callable, strings and arrays, and the first class callables, which are really closure objects.
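
The ambiguity around bare names can be seen in a small sketch. The constant and the Job class are hypothetical, chosen only to show that the same name can mean two different things.

```php
<?php
declare(strict_types=1);

// Constants and functions live in separate symbol tables,
// so a constant may share its name with a function.
const strlen = 42;

final class Job
{
    public string $run = 'property value';

    public function run(): string
    {
        return 'method result';
    }
}

$job = new Job();

var_dump(strlen);        // int(42): the bare name is the constant
var_dump(strlen('abc')); // int(3): with parentheses it is the function
var_dump($job->run);     // the property named "run"
var_dump($job->run());   // the method named "run"
```

Because a bare name already resolves to a constant or a property, it cannot also stand for the function or the method.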

Derick Rethans 7:11

Which actually sort of brings me to the next question which just popped in my head, which is: does this first class callable syntax return a closure, or an existing callable type as we have now, with a callable type being a single string or this array syntax that we now use?

Nikita Popov 7:28

The syntax returns a closure. Actually, the syntax works essentially the same way as the Closure::fromCallable method. And we do need to return a closure, otherwise we don't get this behaviour where the scope is bound at the time where the callable is created, rather than called. I think maybe going forward, I would generally recommend that people use the Closure type instead of the callable type in type declarations. I mean, you already cannot use callable for property types, exactly due to this problem that callability is context dependent. While we only forbid it in property types, the same general problem also exists for argument and return types. And especially with the new syntax being introduced here, I think it's best to use Closure instead of callable in the future.
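
A short sketch of that recommendation, assuming PHP 8.1 or later; the Formatter class is hypothetical.

```php
<?php
declare(strict_types=1);

final class Formatter
{
    // Closure is accepted as a property type; callable is not allowed here.
    public function __construct(private Closure $transform)
    {
    }

    public function format(string $value): string
    {
        return ($this->transform)($value);
    }
}

$formatter = new Formatter(strtoupper(...));

echo $formatter->format('php'), PHP_EOL;  // PHP
var_dump(strlen(...) instanceof Closure); // bool(true)
```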

Derick Rethans 8:18

Does that sort of mean that first class callable syntax is syntactic sugar? Or does it do more than the Closure::fromCallable method?

Nikita Popov 8:27

No, I think it's effectively just syntactic sugar for Closure::fromCallable.

Derick Rethans 8:34

I'm actually not sure whether Xdebug is able to do anything with these Closure::fromCallable things to begin with. So that is something I'm going to have to investigate.

Nikita Popov 8:44

Be able as in like, display that it actually refers to a specific method rather than just some kind of closure?

Derick Rethans 8:51

Yeah, because at the moment, it shows you the file name and the line numbers; it doesn't have a name, right, if you create normal closures. But in this case, it's important to know that it actually refers to specific methods, which is the same thing as Closure::fromCallable would also do, but I've never done anything with that.

Nikita Popov 9:10

But I think there is a way to get like the underlying prototype for the closure, and you should be able to determine it from there.
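
One userland way to recover what such a closure points at is reflection. This is a sketch under that assumption and may not be the mechanism Nikita has in mind for Xdebug; the Mailer class is hypothetical.

```php
<?php
declare(strict_types=1);

final class Mailer
{
    public function send(string $to): bool
    {
        return $to !== '';
    }
}

$fn     = new ReflectionFunction(Closure::fromCallable('strlen'));
$method = new ReflectionFunction(Closure::fromCallable([new Mailer(), 'send']));

// The closure reports the name of the function or method it wraps.
echo $fn->getName(), PHP_EOL;     // strlen
echo $method->getName(), PHP_EOL; // send

// The bound object is available as well.
echo get_class($method->getClosureThis()), PHP_EOL; // Mailer
```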

Derick Rethans 9:18

The first class callable syntax, are there situations where you can't use it?

Nikita Popov 9:22

One place where you don't want to use the new syntax is if you don't want to actually create a closure object and validate callability at the point of creation. For example, creating this first class callable also implies that you have to autoload the class for a static method. If you have some kind of large definition of static handler methods for routes or something like that, then using the first class callable syntax would imply that you have to immediately create closure objects for all of these and immediately load all those classes. That's a use case where you might want to stick with the old syntax.

Derick Rethans 10:01

But wouldn't opcache resolve that issue really?

Nikita Popov 10:04

No, opcache is really exactly the reason why you wouldn't want to do that. For example, for my FastRoute library, I cache all the data as a static array. And that's something that opcache can cache very efficiently because it's in shared memory and accessing it is essentially zero cost. If you include something like first class callables in it, then those have to always be created at runtime, because we don't have a concept like a persistent object. That means that, I mean, the whole script can be in shared memory, but it still has to be executed always at runtime to construct the whole data structure. And that's going to be less efficient. To give a clearer answer to your question: the first class callable syntax has a cost when creating the callable, and if you are in a situation where avoiding that cost is really critical for performance, that's when you wouldn't want to use it.
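
A sketch of the trade-off described here: keep the route table as plain data and create a closure only for the route that matches. The ROUTES constant, UserController class, and dispatch function are hypothetical and are not FastRoute's actual API.

```php
<?php
declare(strict_types=1);

final class UserController
{
    public static function show(string $id): string
    {
        return "user $id";
    }
}

// Plain array of strings: the kind of static data described above.
// No closure objects are created just by defining this table.
const ROUTES = [
    '/users' => [UserController::class, 'show'],
];

// Hypothetical dispatcher: pay the closure cost for the matched route only.
function dispatch(string $path, string $arg): string
{
    $handler = Closure::fromCallable(ROUTES[$path]);

    return $handler($arg);
}

echo dispatch('/users', '7'), PHP_EOL; // user 7
```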

Derick Rethans 11:00

And instead you'd have to use the old callable syntax that we already have.

Nikita Popov 11:03

Exactly. So for that reason, I think that the old syntax is not going to be removed in the near future at least, though maybe we can deprecate certain aspects of it. For example, the syntax also allows you to do highly context dependent things like referencing self, which is even worse than the situation with a private method, because self could refer to something different every time you call it. Those are some things we might want to deprecate early, but the main syntax itself is probably going to stay for a while.

Derick Rethans 11:34

Because callability is checked when you create the closures does that mean it also checks for strictness then? If your PHP file has been declared with strict types?

Nikita Popov 11:45

Strictness is handled the same as with Closure::fromCallable. The strictness is still determined at the time where the call is made, not where the callable was created. Actually, I am not a fan of how PHP handles strict types together with dynamic calls. But that's a pre-existing problem, and this isn't touching on it.

Derick Rethans 12:06

The language has many issues that probably could have been done better if it was designed from scratch. But that ship has sailed 26 years ago.

Nikita Popov 12:15

The strict types are not quite that old.

Derick Rethans 12:17

No, that is true. The language itself is of course.

Derick Rethans 12:23

Okay, thank you very much then, for taking the time this morning to talk to me about first class callable syntax.

Nikita Popov 12:29

Thanks for having me, Derick.

Derick Rethans 12:30

Thank you for listening to this installment of PHP internals news, a podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening and I'll see you next time.


245: Exposing Secrets

Thanks to all of our Patreon Sponsors:

Sevi ** New Patreon Member!!!!**
HONEYBADGER.io ** This week's Sponsor **
Wayne
S Ferguson
Holly S.
Marcus
dreamup
David Q.
Jeff K.
Luciano N.
Clayton S.
Maciej P.
Kenrick B.
Shelby C.
Alex B.
Elgimbo
Peter A.
R. C. S.
Kalen J.
Ken F.
Blaž O.
Matt L.
Billy
Mike W.
Kevin Y.
ahinkle
Darryl H.
MikePageDev
ButteryCrumpet
Ronny MN.
Dmitri G.
Enno R.
Jeroen F.
Shawn
Knut B.
Rodrigo C.
Tony L.
Frank W.
Ben R.
Boštjan O


PHP Internals News: Episode 91: is_literal

In this episode of "PHP Internals News" I chat with Craig Francis (Twitter, GitHub, Website), and Joe Watkins (Twitter, GitHub, Website) about the "is_literal" RFC.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:14

Hi, I'm Derick. Welcome to PHP internals news, a podcast dedicated to explaining the latest developments in the PHP language. This is Episode 91. Today I'm talking with Craig Francis and Joe Watkins, talking about the is_literal RFC that they have been proposing. Craig, would you please introduce yourself?

Craig Francis 0:34

Hi, I'm Craig Francis. I've been a PHP developer for about 20 years, doing code auditing, pentesting, training. And I'm also the co-lead for the Bristol chapter of OWASP, which is the Open Web Application Security Project.

Derick Rethans 0:48

Very well. And Joe, will you introduce yourself as well, please?

Joe Watkins 0:51

Hi, everyone. I'm Joe, the same Joe from last time.

Derick Rethans 0:56

Well, it's good to have you back, Joe, and welcome to the podcast Craig. Let's dive straight in. What is the problem that this proposal's trying to resolve?

Craig Francis 1:05

So we try to address the problem where injection vulnerabilities are being introduced by developers when they use libraries incorrectly. We will have people using the libraries, but they still introduce injection vulnerabilities because they use them incorrectly.

Derick Rethans 1:17

What is this RFC proposing?

Craig Francis 1:19

We're providing a function for libraries to easily check that certain strings have been written by the developer. It's an idea developed by Christoph Kern in 2016. There is a link in the video, and Google are using this to prevent injection vulnerabilities in their Java and Go libraries. It works because libraries know how to handle these data safely, typically using parameterised queries, or escaping where appropriate, but they still require certain values to be written by the developer. So for example, when querying a database, the developer might need to write a complex WHERE clause, or maybe they're using functions like DATEDIFF, ROUND, IFNULL. Obviously, this function could be used by developers themselves if they want to, but the primary purpose is for the library to check these values.

Derick Rethans 2:05

That is a method of doing it. What is this RFC adding to PHP itself?

Craig Francis 2:09

It just simply provides a function which just returns true or false if the variable is a literal, and that's basically a string that was written by the developer. It's a bit like if you did is_int or is_string, it's just a different way of just sort of saying, has this variable been written by the developer?

Derick Rethans 2:28

Is that basically it?

Craig Francis 2:30

That's it, yeah.

Joe Watkins 2:32

It would also return true for variables that are the result of concatenation of other variables that would pass the is_literal check. Now, this differs from Google, because they introduced that at the language level, but not only at the language level, at the idiom level. So that when you open a file that's got queries in PHP, commonly, if they're long, basic concatenation is used to build the query and format it in the file so that it's readable. So it wouldn't really be very useful if those queries that you see everywhere in stuff like phpMyAdmin, and WordPress, and Drupal and just normal code weren't considered literal, just because they're spread over several lines with the concatenation operator. It's strictly not just stuff that's written by the programmer, but also stuff that was written by the programmer and concatenated with other stuff that was written by the programmer.

Derick Rethans 3:33

Now in the past, we have seen something about adding taint support to PHP, right? How is this different, or perhaps similar, to taint checking?

Craig Francis 3:44

At the moment today, there is a taint extension, which is something you need to go out of your way to install, and actually learn about and how to use. But the main difference is that taint checking goes on the basis of saying this variable is safe or unsafe. And the problem is that it considers anything that has been through an escaping function like htmlentities as safe. But of course, the problem is that escaping is difficult, and it's very easy to make mistakes with that. A classic example is if you take a value from a user, say their homepage URL: if you use HTML encoding and then put it into the href attribute of a link, that can still result in an HTML injection vulnerability, because the escaping is not aware of the context in which it is used. If the evil user put in a JavaScript URL, that is inline JavaScript, and that has created a problem, because taint checking would assume that because you used HTML encoding it is safe. All I'm saying is that it creates a false sense of security. And by stripping out all that support for escaping, it means that you can focus on libraries doing that work, because they know the context, they understand the domain, and we can just keep it a much simpler and much safer approach.
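
The href example can be made concrete with a small sketch. The safeHref helper is hypothetical, and its scheme allow-list is just one possible context-aware check, not something the RFC prescribes.

```php
<?php
declare(strict_types=1);

$homepage = 'javascript:alert(document.cookie)'; // attacker-supplied value

// HTML encoding leaves this value unchanged: it contains no special characters.
$encoded = htmlspecialchars($homepage, ENT_QUOTES, 'UTF-8');
echo '<a href="' . $encoded . '">Homepage</a>', PHP_EOL;

// Hypothetical context-aware helper: it also checks the URL scheme.
function safeHref(string $url): string
{
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));

    if (!in_array($scheme, ['http', 'https'], true)) {
        return '#';
    }

    return htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
}

echo '<a href="' . safeHref($homepage) . '">Homepage</a>', PHP_EOL;
```

The first link still carries the script URL even though it was "escaped"; the second falls back to a harmless placeholder.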

Derick Rethans 5:02

Would you say that the is_literal feature is mostly aimed at library authors and not individual developers?

Craig Francis 5:09

Yeah, exactly. Because the library authors know what they're doing. They're using well tested code, many eyes over it. The problem libraries have at the moment is that they trust the developer to write things themselves. And unfortunately, developers introduce a lot of injection vulnerabilities with those strings before they even get into the library.

Derick Rethans 5:30

How would a library deal with strings that aren't literal then?

Craig Francis 5:35

So it really depends on each individual example. And the RFC does include quite a lot of examples of how each one will be dealt with. The classic one is, let's say you're sorting by a column in a database, because if we're dealing with SQL, the field name might come from the user. But that is also quite a risky thing to do if you start including whatever field name the user wrote. So in the RFC, I've created a very simple example where the developer would create an array of fields that you can sort by, and then whatever the user provides, you search through that array, and you pull out the one that matches, and that is fine. And therefore you are pulling out a literal and including it into the SQL. To be fair, these ones are quite unique. And each one needs to be dealt with in its own way. But I've yet to find an example where you can't do it with a literal. Having said that, I think Larry Garfield actually gave an example where a content management system changed its database structure. And the way that would work is the library would have to deal with it: they would receive the value for a field, and then that field would be escaped and treated as a field. It understands it as a field, and it will process it as such; then it can include it into the SQL, knowing full well that everything else in that SQL is a literal, and then it can just build up SQL in its own way internally.
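
The sorting example might look roughly like this. It is a sketch of the idea rather than the code from the RFC; the orderByClause helper and the column names are hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: map user input onto developer-written column names.
function orderByClause(string $requested): string
{
    $sortable = ['name', 'email', 'created_at'];

    // Look the user's value up, then use the array's own entry, not the input.
    $index  = array_search($requested, $sortable, true);
    $column = $index === false ? 'name' : $sortable[$index];

    return 'ORDER BY ' . $column;
}

echo orderByClause('email'), PHP_EOL;               // ORDER BY email
echo orderByClause('1; DROP TABLE users'), PHP_EOL; // ORDER BY name
```

Whatever the user sends, the string that reaches the SQL was written by the developer.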

Derick Rethans 6:58

Okay, talking a little bit about the implementation here. Since PHP 7, we have this concept of interned strings, or maybe even before that actually, I don't quite remember. Which is pretty much a flag on each string in PHP that says this has been created by the engine, or by code. Why would strings have to have an extra flag here to remember that it is created by the programmer?

Joe Watkins 7:21

Well, interned does not mean literal. It's an optimization in the engine, so we reuse strings. We're free to do whatever we want with that. At the moment, by happenstance, most interned strings are those written by the programmer. If you think about the sort of strings that are written by the programmer, like a class name, when those things are declared internally, by an extension, or by core code, those things are interned as if they were written by the programmer. They don't mean literal; we're free to use interned strings for whatever we want. For example, a while ago, someone suggested that we should intern keys while JSON decoding or unserializing. It didn't happen, but it could happen. And then we'd have the problem of, well, how do we separate out all this other input. There is another optimization attached to interned strings, which is one character strings, where if you type only one character, or you call a class A or B, or whatever, the permanent interned string will be used. When the chr function is called, that results in the return of that function always being marked as interned. So it would show as literal, which is not a very nice side effect. And that's just a side effect that we can see today. We don't want to reuse the flag really, it does need to be distinct. Also, if you're going to concatenate, whether you do it with the VM or a specific function, obviously, you need to be able to distinguish between an interned string and a literal string, because interned means it has a specific life cycle and specific value. And we can't break that.

Derick Rethans 9:00

So there are really two different concepts, is what you're saying, and hence, they need to have a special flag for that?

Joe Watkins 9:06

Yeah, they're two very separate concepts. And we don't want to restrict the future of what interned strings may be used for. We don't want to muddy the concept of a literal.

Derick Rethans 9:16

Of course, with any sort of mechanism that languages build in to solve or prevent injections in any sort of form, there are always ways around it. Theoretically, how would you go around the is_literal checks to still get a user inputted value into something that passes the is_literal check?

Craig Francis 9:36

Generally speaking, you would never need it because the library should know how to deal with every scenario anyway. And it's not that difficult. We're only talking about things like, in the database world, you'll be taking values for field names, and therefore it should receive field names or table names. And, you know, we are providing a guardrail as a safety net. And what should happen is that the default way in which programmers work should guide them to do it the right way. We're not saying that you can't do weird things to intentionally work around this. A really ugly version, which you should never do, is to use eval and var_export together; it's horrible. But if you are so desperate that you need to get around this, that's how you'd do it. But in reality, we can't find any examples where you'd actually need to do this.

Joe Watkins 10:22

I would say that, hey, there's this idea that most people writing PHP are using libraries, and they're using frameworks. I don't actually find that to be true. I've been working in PHP for a long time. And most of the big projects I've worked on for a long time did not start out using frameworks. And they did not start out using libraries. They look a bit like that today, but at their core, they are custom. There may be a framework buried in there. But there is so much code that the framework is a component and is not the main deal. Most code, we actually do write ourselves, because that's what we're paid to do. I think we don't decide how people are going to use it, and we don't decide where they're going to use it. The fact is, like Craig said, it's a guardrail that you can work around easily. And if you find a use case for doing that, then we shouldn't prejudge, and say, well, that's the wrong thing to do. It might not be the wrong thing to do. For example, an earlier version of the idea included support for integers. We considered integers safe, regardless of their source. If you wanted to do that in your application, you could do that very easily, and the integrity of the guardrail is not compromised. I wouldn't focus on this is for libraries, and this is for frameworks, because these things become so small in the scheme of things that they're meaningless. I mean, most of the code we work on is code that we wrote, it is not frameworks.

Derick Rethans 11:48

That also nicely answers my next question, which is what's happened to integers, which you have now nicely covered. The RFC talks about how it's hard to educate people to do the right thing, and that is_literal is more focused, so to say, on libraries, and perhaps query building frameworks, as the RFC alludes to. But I would say that most of these query building tools or libraries already deal with escaping input values. So why would it make sense for them to start using is_literal if they're handling most of these cases already anyway?

Craig Francis 12:24

If you look at the intro of the RFC, there's a link to show examples of how libraries currently receive the strings. And you're right that the Query Builder approach is a risky thing; I would still argue it's an important part. That's why libraries still provide them. Doctrine has a nice example with DQL. The Doctrine Query Language is an abstraction that they've created, which is also vulnerable to injection vulnerabilities. And it gives the developer a lot more control than a very basic API. I still think people should try and use the higher level APIs because they do provide a nice safe default, but that depends on which library you use; they're not always safe by default. So for example, when you're sort of saying: I want to find all records where field, parameter one, is equal to value, parameter two, a lot of the libraries assume that the first parameter there is safe and written by the developer. They can't necessarily just escape it as though it's a field, because that value might be something like date, bracket, field, bracket, and it's sort of relying on the developer to write that correctly, and not make any mistakes. And that hasn't proven to be the case; you know, they do include user values in there.

Derick Rethans 13:43

Just going back a little bit about some of the feedback, because feedback to the RFC has happened for quite some time now. And there were lots of different approaches first tried as well, and suggested to add additional functions and stuff like that. So what's been the major pushback to this latest iteration of the RFC?

Joe Watkins 14:01

So I think the most pushback has come from an earlier suggestion that we could allow integers to be concatenated and considered literal. We experimented with that, and it is possible, but in order to make it possible, you have to disable an optimization in the engine, and that would not be an acceptable implementation detail for Dmitry. It turns out we don't actually need to track their source, technically, but it made people extremely uncomfortable when we said that. Even when we got an independent security expert to comment on the RFC, and he tried to explain that it was no problem, it was just not accepted by the general public. I'm not sure why.

Derick Rethans 14:45

All right. Do you have anything to add Craig?

Craig Francis 14:48

The explanation given by people is they liked the simpler definition of what that was, as in, it's a string written by the developer. Once you start introducing integers from any source, while it is safe, it made people feel, yeah, what is this. And that's where we also had the slight issue, because we had to find a new name for it. And I did the silly thing of sort of asking for suggestions, and then bringing up a vote. And then we had, I think it's 18 to three people saying that it should be called is_trusted, and you have that sinking moment of going, oh, this is going to cause problems, but hey, democracy. It creates that illusion that it's something more. So that's why we sort of went, actually, while I like Scott's idea of maybe calling it is_noble, it is a vague concept, which people have to understand. And it's a bit strange. Whereas going back to the simpler, original example, they all seem to have a grasp of that one. And we could just keep the original name of is_literal, which I've not heard any real complaints about.

Derick Rethans 15:53

I think some people were equating is_trusted with something that we've had before in PHP called safe mode, which was anything but, of course.

Craig Francis 16:02

Yes, no, definitely.

Derick Rethans 16:03

We're sort of coming to the end of what to chat about here. Does the introduction of is_literal introduce any BC breaks?

Craig Francis 16:11

Only if there's a userland version of is_literal, which I'm fairly sure is going to be unlikely. So someone defining their own function called that.
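
The break Craig mentions is a name clash: PHP refuses to declare a userland function whose name matches a built-in one. The reverse situation, a library that wants to call is_literal while still loading on PHP versions that lack it, can be handled with `function_exists()`. This is a minimal sketch, assuming the proposed function takes a single argument and returns a bool; `requireDeveloperString()` is a hypothetical helper name, not part of the RFC.

```php
<?php
declare(strict_types=1);

// Hypothetical library helper, for illustration only.
function requireDeveloperString(string $sql): string
{
    // is_literal() exists only if the RFC is accepted. The function_exists()
    // guard short-circuits first, so the call is never reached on PHP
    // versions without it and the check is simply skipped.
    if (function_exists('is_literal') && !is_literal($sql)) {
        throw new InvalidArgumentException('SQL must be a developer-written string.');
    }

    return $sql;
}

$sql = requireDeveloperString('SELECT id FROM articles WHERE author_id = ?');
```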

Derick Rethans 16:18

Did you check for it?

Craig Francis 16:20

Yes.

Derick Rethans 16:21

So if you haven't found it, then it's unlikely to exist.

Craig Francis 16:24

There are still private repositories; we can't check through all their code. But yeah.

Derick Rethans 16:29

Did I miss anything?

Craig Francis 16:31

We covered future scope, which is the potential for a first class type, which I think would be useful for IDEs and static analysers. But this is very much a secondary discussion, because that could build on things like intersection types, but we still need to focus on what the flag does. And there's also the possibility of using this with the native functions themselves, but we do have to be careful with that one, because, you know, we've got things like phpMyAdmin. We have to be able to mark the output from libraries as trusted, because they're unlikely to still be providing a literal string at the end of it. So that's a discussion for the future. And the only other thing is that, you know, the vote ends on the 19th of July.

Derick Rethans 17:08

Which is the upcoming Monday. How is the vote going? Are you confident that it will pass?

Craig Francis 17:13

Not at the moment. We're sort of trying to talk to the people who voted against it. And we've not actually had any complaints as such. The only person who sort of mentioned anything was saying that we should rely on documentation, and the documentation is already there, and it's not working. I think a lot of people just voted no because they're just sort of going: well, that's the safe default, I don't think it's necessary, or, you know, I like the status quo. And we are still trying to sell the idea and say: look, it's really simple. It's not really having a performance impact. And it can really help libraries solve a problem which is actually happening.

Derick Rethans 17:46

Is this something that came out of the people that write PHP libraries or something that you came up with?

Craig Francis 17:52

So I've gone to the library authors and suggested, you know, this is how Google do it. Would you like something similar? And we've certainly had RedBean and Propel ORM show positive support for that. And I've also talked to Matthew Brown, who works on Psalm, the static analysis tool. He's very positive about it, so much so that Psalm now also includes this as well. Obviously, static analysis is not going to be used by everyone. So we would like to bring this back to PHP so that libraries can use it without relying on all developers using static analysis.
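
The static analysis side of this can be sketched with Psalm's `literal-string` docblock type. `prepareStatement()` is a hypothetical function written for this illustration; it does not run a query and is not taken from any of the libraries mentioned.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical library entry point, for illustration only.
 *
 * @param literal-string    $sql        Must be written by the developer.
 * @param list<scalar|null> $parameters User-supplied values, bound separately.
 *
 * @return array{sql: string, parameters: list<scalar|null>}
 */
function prepareStatement(string $sql, array $parameters = []): array
{
    // At runtime $sql is an ordinary string. Only the analyser enforces the
    // literal-string constraint declared in the docblock above.
    return ['sql' => $sql, 'parameters' => $parameters];
}

$statement = prepareStatement('SELECT * FROM articles WHERE id = ?', [42]);
```

A call that concatenates request data into the first argument is the kind of call the analyser is expected to flag. Without the analyser, the same call runs unchecked, which is the gap a runtime check in PHP itself is meant to close.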

Derick Rethans 18:25

Thank you very much. Glad that you were both here to explain what this is_literal RFC is about.

Craig Francis 18:31

Thank you very much, Derick.

Joe Watkins 18:33

Thanks for having us.

Derick Rethans 18:37

Thank you for listening to this installment of PHP Internals News, a podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening and I'll see you next time.