Renpy Review

(This article is old. I haven't revisited Renpy in years, so my opinions might be different now.)

How Renpy and I met

Renpy is a visual novel engine. I got into it after playing Doki Doki Literature Club, which is made with it. DDLC is far and away the best story I've ever seen and had such a huge impact on me that it was the first game I ever played mods for. There's a large modding community and they've come up with some really great extensions to the story. After playing a few (I got lucky and tried most of the best ones first), I was finally motivated to make one myself.

DDLC review

Hence came MC's Revenge, a project I picked up from another modder who had abandoned it and put it up for adoption. I finished MCR and the reception was good, so not long after I took up a much larger project that I'd actually been planning while I was making MCR, Return To The Portrait.

MC's Revenge

Return To The Portrait

RTTP was a lot more ambitious and required me to deal with Renpy in a lot more depth, so I got to see the ugly complexities and learned to understand most of Dan's code in the original DDLC. So that's my experience with Renpy.

Okay, this is going to be kind of an unfair review because I'm only counting my grievances. That's genuinely how I feel about Renpy - I just never encountered anything that worked better or more easily than I expected. This is probably because I've never used any other VN engine, so I don't know, maybe Renpy is the best one out there. But I do know that it has a lot of weak areas, many of which seem like they would've been easy to design better.

Complex character sprites aren't well supported

Expressive characters in a VN will have different sprite components that can be mixed and matched for a large number of possible "poses". The minimum flexibility I've come to expect for a main character is: 2 left arm positions, 2 right arm positions, 6 mouths, 2 cheeks (normal and blush), 4 eyes (normal/shocked/side/closed), and 3 eyebrows (normal, angry, scared/remorseful). Multiplied out, that list alone gives 576 combinations.

Unfortunately, Renpy seems designed with the assumption of sprites vastly less expressive than that, since you have to define each image you're going to use during init with an `image` statement, and for character poses that means a line for *every permutation you're going to use*. You can't dynamically generate their pose tags.

They added the `layeredimage` statement in Renpy 7 as an attempt at a more sophisticated way of handling this, but it's got a slew of its own problems. It makes your tags insufferably long, since each component has to be space-delimited *and* have a unique prefix typed out everywhere you use it (compare `sayori u11111` to `sayori ul1 ur1 m1 c1 e1 b1`). It doesn't support composing other layered images as far as I can tell. And it seems to cause incorrect image composition: some layers appear with lines of pixels missing between them, even though the same layers worked when hardcoded with `im.Composite`. So despite how badly I needed this feature for RTTP and how many hours I sank into trying to make it work, I ended up not using it. (Part of the difficulty is that the main DDLC characters also have two different outfits and *secondary poses*, which make this even more complex, and one of them has two heads (normal and looking away), both of which can be used on either pose. The `layeredimage` system didn't seem able to accommodate that.)

https://www.renpy.org/doc/html/layeredimage.html

A good visual novel engine would be designed with the assumption that a character will have too many possible poses to hardcode them all, so it would support using a function that converts the posecode as a string to whatever object the engine takes for its `show` statement.

This inadequacy is the whole reason the community member AgentGold had to write the 1800-line `Create_Definitions.pl` for DDLC modders. Implementing pose codes better would have saved most if not all of that tremendous amount of work, as well as hundreds of lines of `image` declarations in almost every VN made with Renpy (which noticeably slow down startup - starting RTTP for the first time can take a whole minute before the window comes up). It would enable developers and their artists to create more flexible and more expressive character sprites.

Too many DSLs... that all do the same thing

I'd expect to run into at least one DSL for a visual novel engine. Script language makes sense. But Renpy has two DSLs besides script language and the `layeredimage` language, and neither of them is simple or intuitive:

screen language

ATL (animation/transformation language)

In screen language, you give each button an `Action` attribute, and the `Action` is syntactically treated as a Python function call. Even the `NullAction` used when you want a button to do nothing is specified with `action NullAction()`.

That's awkward, but the worse problem I have with screen language is that a lot of it seems to just be duplicating the functionality of either Python or Renpy script. Built-in Actions include `Call`, `Jump`, `Show`, `Hide`, `Return`, `SetVariable`, `If` (it takes a condition / action if true / action if false)... any of those sound familiar? There's a screen language version for just about every Renpy script statement I can think of, which is just capitalized and has different syntax. (Note that you also use actual Python flow control constructs in screen definitions, and even outright embed Python code and use variables, so I'm not sure if there's ever a need for `If`.)

Screen Actions

You also can specify multiple actions by passing a *Python list* of actions to the `action` attribute, even though you don't wrap it in a list if there's only one. This is a shining example of why convenience shortcuts can be harmful: if you're learning by example like I was, you see an action specified without brackets and naturally assume that passing a Python list would be either a SyntaxError or a TypeError. If single actions had to be wrapped in brackets, the readability cost of finished code would be minimal and Renpy would be more self-documenting.

There's a lot more to this myriad of redundancy and confusion. Buttons in screen language can be made to show up but not be clickable with their `sensitive` property. *Or* this can be done by using the `SensitiveIf()` action - *as a member in a list of actions*. Because that makes sense. (I learned this way of doing it first, which led me to believe that it was the only way because it's so obviously the worst that I don't think anyone would choose it if they knew any other way.)

Similarly, there's a Python equivalent for almost anything Renpy script does. `renpy.image` can be used from a `python` block (which embeds literal Python code in script) as equivalent to the `image` statement, for example. But the Renpy way is almost always clearer and better, because, of course, that's how it was made to be used.

This all flies in the face of The Zen of Python: "There should be one-- and preferably only one --obvious way to do it." Having this kind of ambiguity only confuses me and forces me to wonder what the difference is and which one I should use.

The Zen of Python

Renpy's the best demonstration I've seen of why features are bad.

The Philosophy of Minimalism

Even after working with Renpy for over a year and getting experience with almost every major area of functionality, there are still a *lot* of things I don't understand about it, and I never will since with any luck I'll never have to work with Renpy again.

ATL doesn't support blocking animations or even uncancelable ones

Okay, so ATL has this distinction between *transformations*, which are for transforming or moving a sprite, and *transitions*, which are for defining how to transition between two images. Transitions are mostly used for things like scene fades. Transforms are used for positioning and focusing characters (enlarge them slightly) when they speak.

Transitions are always blocking and transformations are always non-blocking. This is a reasonable default but not a reasonable hard rule. Sometimes I want to wait for an animation to finish or let other things keep happening while the scene fades.

It gets worse: since transforms are non-blocking, and any transform you apply to a sprite replaces the previous one, transforms get interrupted and left unfinished by subsequent ones depending on how quickly the player moves through.

For example, if I apply my focus transform to a character at the start of their line, which only affects zoom, give them the unfocus transform at the end of their line, and then on the next line I give them an x-move transform to make room for another character to come on screen, it only works if the player waits for the transforms to finish before continuing. If they press space too quickly after the character's line, the x-move transform supersedes the unfocus transform and the character is left half-focused.

There's no way to tell Renpy that a transform should be instantly completed if it gets superseded. You wouldn't believe the shitty ways I had to come up with to hack around this in RTTP. In the original DDLC, Dan's approach was to have every single use of a character animation specify both their destination position and their destination status (normal, focused, sunk). But that approach couldn't work for my script because it doesn't play well with nonlinearity. There are scenes where a player choice affects what position a character's at but the dialogue is mostly the same, so if I'd done it Dan's way I'd have littered every such scene with a vomit-inducing amount of if/else branches.
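
Here is a toy model in plain Python of the behavior I wanted, next to the behavior I got. It is not how Renpy is implemented; the `Sprite` class, the `finish_previous` flag, and the timings are all invented for illustration. It only models one linear animation at a time, where a new one replaces the old.

```python
class Sprite:
    """Toy model: one linear animation at a time, and a new one replaces the old."""

    def __init__(self, zoom, xpos):
        self.props = {"zoom": zoom, "xpos": xpos}
        self._active = None  # (prop, start, end, duration, started_at)

    def _active_value(self, now):
        prop, start, end, duration, started_at = self._active
        progress = min(1.0, (now - started_at) / duration)
        return prop, start + (end - start) * progress

    def animate(self, prop, end, duration, now, finish_previous):
        """Start animating `prop` toward `end`, replacing any running animation."""
        if self._active is not None:
            old_prop, reached = self._active_value(now)
            if finish_previous:
                reached = self._active[2]  # snap to the old animation's end value
            self.props[old_prop] = reached
        self._active = (prop, self.props[prop], end, duration, now)

    def state(self, now):
        """Property values at time `now`, including the running animation."""
        props = dict(self.props)
        if self._active is not None:
            prop, value = self._active_value(now)
            props[prop] = value
        return props


def hurried_player(finish_previous):
    """Unfocus at t=0, then the player skips ahead at t=0.1 and an x-move starts."""
    sprite = Sprite(zoom=1.05, xpos=0.5)  # starts out focused
    sprite.animate("zoom", 1.0, 0.25, now=0.0, finish_previous=finish_previous)
    sprite.animate("xpos", 0.3, 0.25, now=0.1, finish_previous=finish_previous)
    return sprite.state(now=1.0)


print(hurried_player(finish_previous=False))  # zoom stuck partway between 1.05 and 1.0
print(hurried_player(finish_previous=True))   # zoom finished at 1.0
```

With `finish_previous=False` the character ends up half-focused, which is the bug described above. With `finish_previous=True` the interrupted unfocus is completed before the move starts.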

ATL doesn't support flow control

ATL is the only DSL that's not only wise enough to not reinvent Python flow control with its own unique syntax but also to *not support Python flow control*. If you want to do branching in ATL logic, you have to find some hideous workaround like having two different transforms and putting the decision logic in the script.

ATL doesn't support non-idempotent animations

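By non-idempotent I mean an animation like "move 50 pixels to the right of wherever you are now", where the result depends on where the character already is. You normally use the `pos` (`xpos` and `ypos`) properties to set positions, and they only take absolute values. The `offset` properties (basically `pos` that can't take a fraction of the screen dimensions) are independent of those and could be used to work around this in some situations, but they still have to be idempotent, so at best this could work if you know you'll only need to nudge the character once.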

And there's another problem with trying to use `offset` to get around this: since they're independent and transforms don't touch any properties they don't mention, if you later apply a transform that only sets `pos`, `offset` will stay how the previous one left it.

So pretty much, if you ever want to animate a character moving a specified distance instead of to a specified position, you have to keep track of their position. And you know what that means if your script is non-linear...
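
The bookkeeping this forces on you looks roughly like the following plain Python sketch. The `PositionTracker` class and its method names are hypothetical; the number it returns is what you would then feed to an ordinary absolute-position transform.

```python
class PositionTracker:
    """Remembers each character's last absolute xpos (in pixels) so that a
    relative nudge can be converted into an absolute target."""

    def __init__(self):
        self._xpos = {}

    def place(self, name, xpos):
        """Record an absolute position and return it."""
        self._xpos[name] = xpos
        return xpos

    def nudge(self, name, dx):
        """Move `name` by `dx` pixels from wherever they were last placed."""
        if name not in self._xpos:
            raise KeyError(f"{name} has no recorded position yet")
        self._xpos[name] += dx
        return self._xpos[name]


tracker = PositionTracker()
tracker.place("alice", 400)
print(tracker.nudge("alice", 50))  # first nudge
print(tracker.nudge("alice", 50))  # second nudge lands somewhere new
```

Every branch of the script that moves a character has to go through the tracker, or the recorded position drifts away from the real one.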

ATL doesn't allow mixing relative and absolute positions to the same property

ATL parameters interpret position as absolute pixels if it's an int, and a proportion of screen width if it's a float. But of course this means you can't mix them. Want to show a character at 100 pixels to the right of the middle of the screen? Too bad, cause `xpos 0.5 + 100` would mean `xpos 100.5` - 100.5 times the screen width.

It's worse than that. That limitation is at least understandable, but you can't even mix them in different places in the same transform. For example,

transform test:
    xpos 200
    linear 1 xpos 0.5

will make a character appear at 200 times the screen width and move to the middle over 1 second.

transform test:
    xpos 0.5
    linear 1 xpos 100

makes the character appear at half a pixel in and move to 100 pixels in. So apparently it just goes by whichever is last.

Again, in some situations you could use `xoffset` to get around this, but that's a timebomb for a few fun hours debugging when you add a transformation that needs to not touch `xoffset`.
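
For comparison, here is a sketch of a position type that would allow mixing, written in plain Python. `Pos` is a hypothetical type invented for this example, not something ATL offers; it just keeps the fractional part and the pixel part separate until the screen size is known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pos:
    """A position made of a screen fraction plus a pixel offset (hypothetical type)."""

    fraction: float = 0.0
    pixels: int = 0

    def __add__(self, other):
        if isinstance(other, Pos):
            return Pos(self.fraction + other.fraction, self.pixels + other.pixels)
        if isinstance(other, int):
            return Pos(self.fraction, self.pixels + other)  # bare ints mean pixels
        return NotImplemented

    def resolve(self, screen_size):
        """Convert to absolute pixels for a screen dimension of `screen_size` pixels."""
        return round(self.fraction * screen_size) + self.pixels


# 100 pixels to the right of the middle of a 1280-pixel-wide screen.
middle_plus_100 = Pos(fraction=0.5) + 100
print(middle_plus_100.resolve(1280))
```

Because the two parts never collapse into a single number, there is no int-versus-float guessing, and interpolating between two `Pos` values would be well defined.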

ATL doesn't support defining the duration of an animation based on moving at a constant speed rather than the total duration being a constant

I don't think I need to elaborate here. It's not needed that often, but this would've been the first thing I'd think of if I wrote the ATL specification.
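
For completeness, the arithmetic is just distance divided by speed. A minimal Python sketch, with a hypothetical function name and pixel coordinates assumed:

```python
import math


def duration_for_speed(start, end, speed):
    """Seconds needed to travel from `start` to `end` (both (x, y) in pixels)
    at a constant `speed` in pixels per second."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return math.dist(start, end) / speed


print(duration_for_speed((0, 0), (300, 400), 250))
```

The result is what you would pass as the duration of a `linear` move if you wanted every character to walk at the same pace regardless of how far they have to go.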

The zorder problem

Here's a common problem that needs solving in visual novels: different characters are shown on screen at once, and we generally want whoever's speaking to be "on top".

The zorder solution: each image onscreen has a zorder and images with higher zorders are shown on top of lower zorders. Whenever a character who's partly behind someone else has a line, you increase their zorder to be higher than the surrounding characters'.

The problem with the zorder system is a lack of *purity*, in the functional sense. Characters' zorders increase throughout a conversation as they need to be put on top of each other, but rarely decrease, since it's always the person who's speaking that motivates a change. So snippets of conversation have *side effects*, which makes it very troublesome to edit them or write nonlinear scenes: if you change a snippet of conversation, it can affect the zorders everyone comes out of it with, which means everything from then until the next screen clear could need to be edited to reflect it.

Renpy also has the `behind` keyword to save you from having to keep track of the ugly growing zorder numbers, but it seems to be syntactic sugar on `zorder` (i.e. it adjusts the character's zorder to put them behind the given other characters), and so it has its own problem. Imagine I have four characters on screen:

Alice(zorder 2), Bob(zorder 3), Carl(zorder 1), Dana(zorder 2)

Carl needs to start speaking. So I `show dana behind carl`, then `show bob behind carl`. Problem is, that puts Bob behind Alice as well!

In general, to avoid these unintended side effects we need the zorder change to be *on the character who's motivating the change* and not on the characters around them, otherwise the effects can ripple, requiring me to also `show alice behind bob` - three separate `show` statements for one character's zorder change.

`behind` is exactly the wrong solution. An `infrontof` keyword (ideally with an `all` parameter) would've been 100x more helpful.
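
Here is the Alice/Bob/Carl/Dana example worked through in plain Python. `show_behind` models `behind` the way I described it above, which may not match what Renpy does internally, and `show_in_front_of_all` is the hypothetical `infrontof all` I'm asking for. Both function names are invented.

```python
def show_behind(zorders, name, other):
    """Model of `behind` as described above: lower `name` to just below `other`."""
    updated = dict(zorders)
    updated[name] = min(updated[name], updated[other] - 1)
    return updated


def show_in_front_of_all(zorders, name):
    """Hypothetical `infrontof all`: raise only `name`, leave everyone else alone."""
    updated = dict(zorders)
    others = [z for n, z in updated.items() if n != name]
    if others:
        updated[name] = max(updated[name], max(others) + 1)
    return updated


scene = {"alice": 2, "bob": 3, "carl": 1, "dana": 2}

# Carl starts speaking, the `behind` way: two statements, and Bob ends up under Alice.
via_behind = show_behind(show_behind(scene, "dana", "carl"), "bob", "carl")
print(via_behind)

# The same thing with the change placed on Carl: one statement, no ripple.
via_infront = show_in_front_of_all(scene, "carl")
print(via_infront)
```

In the second version the relative order of Alice, Bob, and Dana is untouched, so nothing downstream needs re-checking.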

`persistent` object returns None on accessing undefined fields

No, that's really what happens. JavaScript has infected Renpy. Goodbye run-time typo detection.

This isn't even how Python objects work. Someone added extra code to make the global `persistent` object behave in an unexpected and bug-prone way.
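
To show why this matters, here is a plain Python sketch of the two behaviors side by side. `LenientStore` is a toy stand-in that answers None for unknown fields, written by me for illustration; it is not Renpy's actual `persistent` implementation. `StrictStore` is just an ordinary Python object.

```python
class LenientStore:
    """Toy stand-in for an object that answers None for any unknown field."""

    def __getattr__(self, name):
        # __getattr__ only runs when normal attribute lookup has failed.
        if name.startswith("__"):
            raise AttributeError(name)  # leave Python's own special lookups alone
        return None


class StrictStore:
    """Plain Python object: unknown fields raise AttributeError."""


def ending_unlocked(store):
    # Deliberate typo: the field that was set is "seen_ending".
    return bool(store.seen_endnig)


lenient = LenientStore()
lenient.seen_ending = True
print(ending_unlocked(lenient))  # the typo is silently read as None

strict = StrictStore()
strict.seen_ending = True
try:
    ending_unlocked(strict)
except AttributeError as error:
    print("caught:", error)  # the typo is reported immediately
```

With the lenient version the game just behaves as if the ending was never seen, and you get to find the typo yourself.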