- 1 Chapter 1: The Zen of Getting Started
- 2 Chapter 2: GameBoy Architecture in a Nutshell
- 3 Chapter3 : Game Boy Development Tools
- 4 Chapter 4 : Starting With the Basics
- 5 Being One with the Pixel (pp 111)
- 6 Chapter 6: Tile-based Video Modes
- 7 Chapter 7: Rounding up Sprites
- 8 Chapter 8: Using Interrupts
- 9 Chapter 9: The Sound System
- 10 Chapter 10: Interfacing with the Buttons
- 11 Chapter 11: ARM7 Assembly Language Primer
- 12 Conclusions
Introduction
“Programming the Nintendo GameBoy Advance: The Unofficial Guide” by J. Harbour is an e-book on GBA programming covering some of the basic and not-so basic items involved in GBA programming. It's a clear, easy to understand and professional-looking book; one that is used by many people taking their first steps into GBA and if I recall correctly several colleges.
That said, it is not without issues. Yes, it does teach the basics of GBA programming, but it also:
- Focuses on the wrong basics (bitmap modes, which are hardly ever used in commercial games).
- Uses inconsistent nomenclature in code and other bad programming practices.
- Contains code that is simply incorrect (does not compile or worse).
- Gives incorrect advice and information.
- Uses inefficient code, even when optimizing it even a little would not have been difficult, or even recommended considering the GBA platform.
While these items may have little consequence in the short-term, in the long-term they will hinder understanding of GBA programming. I believe that there's more to programming than just getting the thing to work – style and efficiency matter as well. This is especially true when teaching programming, as habits learned early on have a tendency to stick.
For example, code with an loose or inconsistent nomenclature makes it harder to read, write and understand, and therefore harder to maintain. It's the Principle of least astonishment: the fewer surprise elements, the smoother the ride. The problem with inconsistencies is that you never know where you stand: “why is it done like this here, but like that there?”, and so forth. They make it hard for novices to understand the rationales behind the code, and coding will ‘magic’: an incantation to utter to get something done. Code is not magic and you have to actually think about things before acting. This is part of the difference between a good programmer and a bad programmer.
Another example is inefficiencies. Sure, Premature optimisation is the root of all evil, but if a function is so slow to be unusable or if it costs next to nothing to make it faster, I think it's better to just start with the faster version in the first place instead of having to correct it later.
Because of things like this, readers of the book may get stuck at some point and will have to unlearn and relearn how to do things properly before they can really continue. Yes, I know how that sounds, but we've seen many, many instances of that on the forum over the last few years. Tonc has been one of my efforts to weed them out; this is another.
Most of the annotations here aren't actually about GBA programming specifically, but general principles of programming like consistent terminology, efficiency, and basic programming principles. Nearly every demo in the book suffers from these problems. Because later demos often repeat the same problems and I don't want to repeat myself too often, I'll only cover a specific problem once.
The main goal in writing this was to be thorough – probably a little too thorough, as some entries will be more important than others. Still, I think that all the points are worth thinking about at least once.
Now, writing something like this should have been the author's job. When someone else does it, especially ‘the competition’, it might come off as a little mean-spirited. This is unfortunate, but sometimes unavoidable. If instructions lead people into blind alleys, putting up signs pointing this out is a good thing, not a bad thing.
1 Chapter 1: The Zen of Getting Started
No annotations (there's little to annotate about introductions anyway)
2 Chapter 2: GameBoy Architecture in a Nutshell
Incorrect info: Memory architecture (pp 54-59)
The term ‘Access Width’ that's used here is a little ambiguous. For example, table 2.2 lists the access widths of the memory sections as 8 or 16 bits, except for IWRAM, where it says 32 bits as well. The problem here is that there is a difference between bus width and (software) access width. Table 1 lists the information according to GBATek. The bus-size may be 16bit in some cases, but this does not mean the access is restricted to that size; it would just take a little longer.
For example, it simply isn't true that VRAM accesses (pp 58) are restricted to halfwords: words work as well, only it'd take a cycle longer. However, when filling regions of memory you have to loop over a given size anyway, and then word-accesses will actually be quicker because there are only half the loop iterations in that case. If you do it right, you can get a performance gain of up to 8 times.
Region | Bus | Read | Write | Cycle |
---|---|---|---|---|
BIOS ROM | 32 | 8/16/32 | - | 1/1/1 |
Work RAM 32K | 32 | 8/16/32 | 8/16/32 | 1/1/1 |
I/O | 32 | 8/16/32 | 8/16/32 | 1/1/1 |
OAM | 32 | 8/16/32 | 16/32 | 1/1/1 |
Work RAM 256K | 16 | 8/16/32 | 8/16/32 | 3/3/6 |
Palette RAM | 16 | 8/16/32 | 16/32 | 1/1/2 |
VRAM | 16 | 8/16/32 | 16/32 | 1/1/2 |
GamePak ROM | 16 | 8/16/32 | - | 5/5/8 |
GamePak Flash | 16 | 8/16/32 | 16/32 | 5/5/8 |
GamePak SRAM | 8 | 8 | 8 | 5 |
Incorrect info: ARM vs Thumb code (pp 60).
There is a performance benefit to using 16-bit instructions in a computer with 16-bit memory, although I don't get into Thumb code at all in this book, as it is an advanced topic. True, there may be cases for using Thumb instructions to speed up parts of a game, but there are also cases for using regular ARM instructions as well.
Yes, Thumb instructions are better suited for 16-bit memories, which is why Thumb instructions are advised for GBA programming. But it is not an advanced topic, as the only thing you have to do to make things compile as thumb code is use the following compiler flags: -thumb -thumb-interwork. The speed-up is estimated at 60% under almost all conditions.
3 Chapter3 : Game Boy Development Tools
HAM stuff. Can't really comment on that.
4 Chapter 4 : Starting With the Basics
4.1 REG_DISPCNT (pp 105-106)
Nomenclature / classification.
It is better that #defines for register bits use a prefix of some sort that indicates where they belong. Names such as MODE_3 and BG2_ENABLE don't do that, which can become problematic when other things have modes and backgrounds as well.
Volatile registers.
All IO registers should be defined as volatile. Because this would make
the definitions long (volatile unsigned long
, etc), use the common
shorthands like u32, vu32, etc.
4.2 Fillscreen demo: inefficient code / bad coding (pp 106-108)
//# From the fill screen demo. ... //create a pointer to the video buffer unsigned short* videoBuffer = (unsigned short*)0x6000000; ... //macro to pack an RGB color into 16 bits #define RGB(r,g,b) (unsigned short)(r + (g << 5) + (b << 10)) ... void DrawPixel3(int x, int y, unsigned short c) { videoBuffer[y * 240 + x] = c; } ... //fill the screen for (x = 0; x < 239; x++) { for (y = 0; y < 159; y++) { DrawPixel3(x, y, RGB(0,(255-y),x)); } }
The following quote has four excepts from the FillScreen project that are problematic.
Inefficiency: Variable for videoBuffer
Creating an pointer variable to store an fixed address like
videoBuffer
is perfectly legal, but not always recommended
for several reasons. Firstly, it takes a little bit of IWRAM space.
Not much, I'll admit, but if you do this for every section/register,
you're already up to half a kB at least.
Secondly is that it can be dangerous, as the pointer can be redirected
at any time. This can be useful on some occasions, but this probably
isn't one of them.
Thirdly, the extra layer of indirection requires a little bit of extra
work for the CPU, because the contents of global variables need to be
read from memory before they can be used. It won't be a lot of work,
but in the wrong places (like a video-buffer pointer, which is used
every VRAM interaction) it can add up dramatically.
The general method for hardware access is a #define. In this case:
#define videoBuffer ((u16*)0x06000000)
Note the double parentheses here: one for the cast and one for the
whole thing. The second one is necessary because array accesses have a
higher operator priority than casts: (u16*)0x06000000[1]
would be parsed as (u16*)(0x06000000[1])
, which won't
even be compiled. This is probably why the variable is used in the
first place.
Still, in the case of videoBuffer, using a variable might not be such a bad idea because it should be redirected for page flipping. But that's for modes 4 and 5, not mode 3.
RGB Macro
Preprocessor macros are generally considered evil, because there are
so many potential pitfalls there. One of them is that parameters can
be expanded in funny ways because the work by direct text substitution.
For example `#define SQUARE(x) x*x
' with x = 1+2
will give 1+2*1+2 = 5, and not 9. This is why every book that
covers C gives the advice to always, always put parentheses
around the macro arguments. Because it's not done here, the call to
RGB()
later on requires the parentheses –
something which one can't expect the users to know – and
they'd have a right to complain when their code goes wonky.
This kind of code is one of the reasons why most books will urge you to use inline functions instead of macros. Use macros only when you really have to.
The macro is also somewhat unsafe because it doesn't validate the
arguments. The color components have ranges between 0 and 31;
go out of the range and one color will bleed into another. But
people are wise enough not to do that, right? Usually, yes, but
in this particular program this is exactly what happens:
the color used is RGB(0, 255-y, x)
meaning that
the green and blue component are generally outside the safe
range here. Yes, it doesn't matter much here because a funky
pattern is kinda what the author is after, but it'd be nice if
unsafe uses like this had been mentioned.
Function for drawing pixels
There are three kinds of procedures in C: macros, functions and inline
functions. Each of these has pros and cons. The important con of
regular functions is the overhead cost for calling them:
setting up the parameters, 2 jumps between the caller and called
function, and potentially a few others as well. In contrast, macros
and inline functions are integrated into the caller so there's no
overhead. Additionally, because functions are separate entities you
(or the compiler) can't optimize their calls much. Like, say, loading
videoBuffer
once and using it with offsets if you're in
a loop. No, you have to reload it each time.
For most functions, the overhead of a function will be small compared
to its body. However, the body of DrawPixel3()
is one
line. In most cases, the overhead here will cost you a factor 3!
As a rule, pixel plotter routines should be macros or inlines.
Screen fill loop
In C, the x-coordinates of a matrix (such as a bitmap) are
contingent in memory. Because of that, it is better to put the
x-loop inside the y-loop. That way, the code will be
optimized for using consecutive memory. Of course, it doesn't matter here
because DrawPixel3()
doesn't allow optimizing, but when
using a proper mode 3 plotter, the effects should be noticeable. Also,
the boundaries are usually the screen dimensions, not one less.
This gives use the following improvements:
#define videoBuffer ((unsigned short*)0x6000000)
// Macro to pack an RGB color into 16 bits
#define RGB(r, g, b) (unsigned short)((r) + ((g) << 5) + ((b) << 10))
static inline void DrawPixel3(int x, int y, unsigned short c)
{
videoBuffer[y * 240 + x] = c;
}
...
// Fill the screen
for (y = 0; y < 160; y++)
{
for (x = 0; x < 240; x++)
{
DrawPixel3(x, y, RGB(0, (255-y), x)); // NOTE: still unsafe, but meh.
}
}
None of these items are particularly difficult to do correctly, nor do they increase the complexity of the code. Both versions will work fine, but the latter is more robust. In trivial projects this doesn't matter, but once the projects get large enough they will become problematic, and you might have to start all over again to get rid of the bad practices.
This is why people object to this book and many of the other tutorials: they're riddled with things like these. Newbie programmers will come into them and adopt the poor programming standards and only later find out that the practices they use are in fact bad. But by then it may already be too late to do anything about them :\.
4.3 ButtonTest project
Bad coding principle: copy/pasta (pp 111)
The project runs over all ten buttons to illustrate button-presses. For each button there is a separate block of code.
ham_DrawText(3, 3, "DOWN");
//# etc ...
and
if (F_CTRLINPUT_UP_PRESSED)
ham_DrawText(0, 2, "X");
else
ham_DrawText(0, 2, " ");
// check DOWN button
if (F_CTRLINPUT_DOWN_PRESSED)
ham_DrawText(0, 3, "X");
else
ham_DrawText(0, 3, " ");
//# etc ...
As a rule, if you find yourself coding by Copy&Paste, there will be an easier way using loops or lookups. These are usually shorter, and easier to read and maintain. For the initial strings it may be debatable if it's worth it, but for the actual button handling it really should be done.
Unfortunately, in this particular instance it actually does require a little more work because HAM does not have the proper functionality, but looping rather than hardcoding each option is still a good principle.
{
if( (~R_CTRLINPUT) & (1<<i) )
ham_DrawText(0, 2+i, "X");
else
ham_DrawText(0, 2+i, " ");
}
And if you want to be really terse, you can even do it like this, but that's probably pushing it.
for(i=0; i<10; i++)
ham_DrawText(0, 2+i, (~R_CTRLINPUT & (1<<i) ? "X" :" ") );
Bottome line: instead of having 60 lines of code, now you have only 8 or even 2. As you add more cases, the shorter version can really help maintainability.
5 Being One with the Pixel (pp 111)
5.1 Mode3Pixels (pp 133-135)
//declare the function prototype
void DrawPixel3(int, int, unsigned short);
...
//changes the video mode
#define SetMode(mode) REG_DISPCNT = (mode)
...
//packs three values into a 15-bit color
#define RGB(r,g,b) ((r)+(g<<5)+(b<<10))
...
x = rand() % 240;
Prototype formulation
A function prototype is a way to tell the compiler that a function of a certain name exists, what the number and types its arguments are, and what type it returns. The prototype given here does just that. However, it should also serve as a summary for the programmer. Because the declaration here doesn't mention the names of the arguments, it's much less useful than it could be.
Consistency: RGB() macro
Note that the r
term now has parentheses. However,
the others do not, so it's still unsafe to use.
Inefficient ranged randoms
While the use of `rand() % N
' for a random number between
0 and N is a time-honoured tradition, it doesn't really work
well for the GBA. Both rand()
and the modulo use
division, which is a pretty slow operation on the GBA.
A better solution would be to
use `rand()*N>>15
', which may not be as obvious,
but it works just as well (better actually, because the higher bits
tend to be more random) and is much, much faster. It makes use of the
fact that rand()
returns numbers in the range
[0, 215), which can be interpreted as a Q15
fixed-point number between 0 and 1. For even faster randoms, replace
rand()
with your own liner congruent RNG that doesn't use
divisions. You can go from 1200 cycles to about 90 that way.
(tonc:qran)
5.2 Mode3Lines, Mode3Circles, Mode3Boxes (pp 147-149)
As I said before, using DrawPixel3()
as a function
instead of a macro or inline means losing a factor 3 in speed.
It gets even worse for DrawBox3()
. The 3x is the standard
function overhead, but in the case of drawing rectangles, there's
potential for optimization because you're accessing adjacent pixels.
This means the speed loss is more like 3.5 instead of 3. And then
there are simple optimisations that aren't taken here. Filling in
words instead of halfwords would double the speed. Using a fast filler
(DMA/memset16) would increase it even more. Much more.
5.3
Mode3Bitmap (pp 152-154
)
Bad coding: #including data
If there's one bad practice the experienced GBA homebrewers would like to see go away, it's this one.
#include "mode3.raw.c"
The reasons why this is bad can be found at tonc:header. The author mentions this is technically a bad practice and that the proper method is separate compiling & linking, but that it's alright for small projects (at least I think he does; I just can't find the reference anymore). This is true, but the fact is that the many of the homebrewers and students don't know the proper procedure yet, and will adopt this standard, which will then have to be unlearned later.
5.4 Palette discussion
mem variable: paletteMem (pp 156)
paletteMem
should be `#define paletteMem
((u16*)0x05000000)
. Also note the inconsistent naming for
memory: paletteMem
and videoBuffer
. We'll
see more of this later.
Transparent color (pp 156)
Note that it's common practice to keep palette entry 0 set to black.
It is indeed common practice, but not really recommended. the problem with having it black is that it'll hide the pixels that really should appear black. It's better to use a bright, easily distinguishable color, so that it's easy to see what should be transparent and what shouldn't. There's a reason that movies and TV-shows use blue-screen or green-screen for CGI, and not black-screen.
Incorrect info: sprite palettes (pp 156)
“Remember, even the sprites in your game uses this palette …”
No, sprites have their own palette.
Incorrect info: DMA (pp 158)
“However, mode 4 has an advantage of being fast when using hardware-accelerated blitting functions and DMA, because twice as many pixels can be copied to video memory.”
Technically this is true, but the phrasing is a little ambiguous. Obviously, the other modes can use DMA just as well – there's nothing special about mode 4 regarding DMA. Also, it's not really that twice as many pixels can be copied; rather, mode 4 simply uses half the number of bytes for the same amount of pixels. But even with DMA and 2× fewer bytes to copy, mode 4 is simply too slow for general games.
5.5 Mode4Pixels (pp 159-161)
Code inconsistency : RGB() macro
Now all terms are properly parenthesized. If it's corrected here, why's it not done in other projects?
Loop index names
Loop variables are named i
, j
, k
and such (or doubled: ii
, jj
, kk
).
n
and m
are used for sizes, not indices. Though
I'm willing to admit I prefer it that way my background is physics
and not of computer science.
Inconsistent use of functions
Why are we using memcpy()
for the bitmap, but not the palette?
memcpy usage (pp 164)
“Memcpy may not be the fastest method, because it doesn't take advantage of writing two pixels at a time …”
True, it's not the fastest method, but it does write 2 pixels at a time.
Four even. In fact, that's the only reason it works: if it
did copy byte for byte it wouldn't work because of the
no-byte-write rule for VRAM. memcpy()
is a tricky bugger,
it will only work right when the size and alignment are right. See
tonc:memcpy
for details.
“However, it may be faster to copy mode4_Bitmap to the video buffer using a loop that iterates 160x120 times …”
No it won't, because memcpy()
already takes the faster route:
struct-copy by 16 bytes in one go. Simple for-loops will almost always
be slower than dedicated copying routines.
“… (that's half the number of pixels)”
Be careful how you interpret this remark. 160×120 is half the
number pixels of a mode 3 buffer, which is what
videoBuffer
refers to. The only reason you need
this amount now is because videoBuffer
is a halfword
array, not a byte array. It's not that you now need to copy half the
number of pixels, but rather half the number of halfwords. You're
still copying 240×160 pixels, they just take up lewer bytes.
It's a subtle differece, I know, but it's an important difference.
“…but I'm not going into optimization at this point because DMA is faster than any software loop.”
Interestingly, it's actually pretty close. A DMA copy is only about 10% faster than a well-written software copier. For filling, it's actually 10% slower!
5.6 Page flipping (pp 164-165)
Incorrect information: double buffering speed
“Basically, when you draw everything to an off-screen buffer, things move along much more quickly. In fact, using a double buffer makes the drawing operations so fast that interferes with the vertical refresh, so you much add code to check for the vertical blank period and only flip in this period.”
No. The speed of drawing is the same, you just don't see it happen. Page flipping is the solution to interference with the VDraw, not the cause. There simply isn't enough time to render a full 240×160 scene within the VBlank period, so some of the drawing would be done as the screen was being updated, leading to graphical artifacts. What page-flipping does is allow you to do is draw everything on a separate buffer and then simply swap it in. It's the swapping that's fast, but not the drawing.
Also, yes, you need to wait for the VBlank to do this to avoid tearing, but this is not the only reason for waiting. Usually, all timing is done relative to the VBlank.
Mem variables/inconsistent nomenclature: FrontBuffer/BackBuffer
Again, using #defines for FrontBuffer
and
BackBuffer
would be preferable. Also note that the terms
structure of the names are inconsistent with earlier variables
(videoBuffer
was camelcase, and paletteMem
had Mem
instead of Buffer
). This may seem a
minor point, but it would just make everything easier to read and remember
if a consistent terminology were used.
The BACKBUFFER
#define is problematic as well, as it
refers to the bit in REG_DISPCNT
instead of the actual
backbuffer. Again, using a prefix to hint at meaing would be preferable.
5.7 Mode4Flip (pp 165-170)
There are several really nasty bugs in this example.
(volatile unsigned short*)0x4000006;
...
paletteMem = RGB(0, 31, 0);
...
//slow it down
n = 500000;
while(n--);
...
void WaitVBlank(void)
{
while(*ScanlineCounter < 160);
}
ScanlineCounter and WaitVBlank
First, as explained before, use #defines for memory mappings: it's
generally faster and less memory intensive – no sense
in wasting memory if there's absolutely no reason for it. Second,
0400:0006
has a perfectly valid name already:
REG_VCOUNT
. Third, notice the volatile
here.
This keyword is a crucial part of the code, without which it
would not function. What volatile
does is force the
compile to not optimise accesses to the variable. This is necessary here
because the value of the register changes without the programmer's
interference. Since volatile
falls outside the experience
of most programmers, this should have been explained a little more in
the text.
As for WaitVBlank; technically, it's correct: it waits till the VBlank.
However, what's needed for proper animation is waiting for the next
VBlank. If we're already in the VBlank, we pass through this function
immediately. That is why there's that wait-loop inside the main loop,
to wait till we get out of the VBlank again. The correct procedure
would be to check REG_VCOUNT
(tonc:vsync).
Also, note the dereference of ScanlineCounter. This is required because
it's a variable; with a #define, you could build the deref right into it.
Incorrect code: paletteMem redirection
And this is one of the reasons why memory mapping should be done with
#defines instead of pointer variables (or at least with
const
-pointers): pointers can be redirected, like it's
done here. What it should have said here is
`paletteMem[1]
', but as it stands, it makes
paletteMem
point to address 0000:03E0
. If
it was a #define or a const, the compiler would have complained.
Bad coding: Slow-down loop
n = 500000;
while(n--);
The only reason this is necessary is because
WaitVBlank()
is flawed. However, this kind of slow-down
method shouldn't work with modern compilers because the optimizer
would recognize it as an empty loop that does nothing and throw it away.
Inefficient code: DrawBox4()
I admit that the function is comprehensible, but it is so ridiculously
slow that it should not be used in an actual project. Even with
a moderate amount of effort, you can get a function that's more than
30× faster. Remember: DrawPixel4()
checks for even
or odd pixels, but if you're working on a stretch of pixels that's
completely unnecessary because both will be painted on.
The key to a faster version is to consider the odd pixels on the left and right separately, and do a fast filler for the halfwords inbetween.
5.8 Mode5Pixels (pp 172)
Inconsistent/unsafe code: RGB() macro
And now the unsafe version is back! O_o.
5.9 Text (pp 173-)
Miscellaneous: font system
“Now, I realize that many programmers use a bitmapped font, and that's not a bad idea at all, because the font characters can be treated as sprites. However, I prefer a low-memory footprint and more control over the font display mechanism.”
While this indeed a good idea, the font actually has quite a large memory footprint, and there are very few control features.
Memory inefficiency: the font
The font used consists of a partial font: ASCII 32 to 96 (some punctuation, numbers and uppercase letters). This is represented by an array of 64×8×8 halfwords. Each halfword can be 0 or 1. This not only leaves out lowercaps and some of the punctuation symbols, but also wastes 15/16 of the space by using only one bit of each halfword. It should have at least been bytes, and at best a 1bpp bitpacked font (tonc:text). Also, the whole 8kiB font (which generally is constant data) is placed in IWRAM!! Losing a quarter of a very precious memory area for something that's 16× bigger than it's supposed to be it just wrong.
It's fairly easy to take the font out of IWRAM and compress it down by a factor of 2. Simply replace this:
//# Original declaration, in IWRAM.
unsigned short font[] =
by this
//# Replacement declaration, in ROM. const unsigned char font[] =
It's possible to compress this down by 8×, but that take a lot more effort.
Inefficiency: DrawChar
Aside from the slowness of DrawPixel3()
(which has been
covered already) the extra math in assigning draw will probably take
some cycles as well. Pointer arithmetic would be of great assistance
here both in performance and clarity. Yes, clarity too: pointers have
a way of reducing the amount of written code because all the constant
or nearly constants offsets are integrated into the pointer, rather
than added to the code for every single step.
{
int x, y;
int draw;
for(y = 0; y < 8; y++)
for (x = 0; x < 8; x++)
{
// grab a pixel from the font char
draw = font[(letter-32) * 64 + y * 8 + x];
// if pixel = 1, then draw it
if (draw)
DrawPixel3(left + x, top + y, color);
}
}
My own mode 3 text writer uses about 1.35k cycles/char. This is a
lot, but it's manageable. This DrawChar()
does 5k cycles/char.
But wait! That's with the font in fast RAM. If you put it in ROM
where it's supposed to be, the time jumps to 7k/char. That means that
you're out of the VBlank in about 11 characters so you could just manage
to print, say, lives left and the score. In my opinion, this is one
of those instances where optimisation is not premature.
6 Chapter 6: Tile-based Video Modes
Incorrect info: parallax scrolling (pp 203).
“Mode 0 is great for this because all four backgrounds are hardware rendered. You do not need to write your own parallax scrolling routine.”
Yes and no. Yes, mode 0 is great, but you still need to write your own parallax scrolling routine because the scrolling registers only do scrolling, not parallax (parallax is differential scrolling). Aside from that, for big maps you'd still need to write your own scrolling engine anyway. However, it will be easier and faster with the tiled backgrounds than in the bitmap modes, because you'd only have to update one row or column of the screenblocks, rather than the whole of VRAM.
Incorrect info: Tile data and Tile map (pp 204)
‘The tile data itself can be stored anywhere in VRAM, as long as it's on a 16 Kb boundary …”
This isn't quite true. The data doesn't have to be aligned to a charblock boundary, it just makes it easier to use. If placed at an offset, all the map entries would have to incorporate that offset as well somehow.
“… video memory is divided into 4 logical Char Base Blocks, which are made up of 32 smaller Screen Base Blocks …”
(pp 204-205).
No. Well, yes, but a misleading yes. Screenblocks have nothing to do with charblocks; the two are completely separate entities. That they happen to use the same memory addresses is an inconvenience to watch out for, not an indication of something hierarchical.
“The tile map (which defines where the tiles are positioned) must begin at screen base 31 at the very end of video memory.”
This is an unfortunately use of “must”. Nothing requires the tilemap to start at screenblock 31, it just so happens that it does in this example.
Incorrect info: DMA Blitting (pp 209)
“Let's not forget that DMA is a hardware process, where a software blitter is compiled and run by the CPU as machine instructions. You can't begin to compare a hardware process with a software process, because anything that is hard-coded into silicon will blow away a series of machine instructions. … you could write a fast memory copy routine in assembler and it would be much faster than a C routine. However, DMA will blow them both away …”
Actually, we can compare them and, as it happens, DMA will not blow a well-constructed asm copier away. Yeah, I was surprised when I saw that as well. The reason that DMA is faster than regular copies is because there is no loop overhead. However, with ldmia/stmia instructions, the loop overhead per byte is significantly reduced anyway, especially for sections with high waitstates. In most cases, DMA will only be 10-20% faster in copying. It will actually be slower when it comes to fills, because it will re-read the data to fill with every time, which a software version won't have to. It's not safe to assume that a hardware process will always beat a software routine.
6.1 TileMode0 (pp 211-215)
//copy the palette into the background palette memory
DMAFastCopy((void*)test_Palette, (void*)BGPaletteMem, 256, DMA_16NOW);
...
#define BG_COLOR256 0x80
#define CHAR_SHIFT 2
#define SCREEN_SHIFT 8
#define WRAPAROUND 0x1
//background mode identifiers
#define BG0_ENABLE 0x100
#define BG1_ENABLE 0x200
#define BG2_ENABLE 0x400
#define BG3_ENABLE 0x800
...
//background memory offset macros
#define CharBaseBlock(n) (((n)*0x4000)+0x6000000)
#define ScreenBaseBlock(n) (((n)*0x800)+0x6000000)
//create a pointer to background 0 tilemap buffer
unsigned short* bg0map =(unsigned short*)ScreenBaseBlock(31);
...
...
//vertical refresh register
#define REG_DISPSTAT *(volatile unsigned short*)0x4000004
//wait for vertical refresh
void WaitVBlank(void)
{
while((REG_DISPSTAT & 1));
}
...
#define BUTTONS (*(volatile unsigned int*)0x04000130)
if(!(BUTTONS & BUTTON_LEFT)) x--;
Partially improper types: DMAFastCopy()
Part of choosing the type of a symbol is making it easy to use.
For general memory, the most useful types are `void*
' or
`const void*
', because then you won't have to force the
user to cast everything and making the code less readable in the
process. In the case of DMAFastCopy()
, the first argument
should have been `const void *
', because source data
shouldn't be modified. The destination should be `void*
'.
Using these types means that the user won't have to cast explicitly. An
other problem with having identical datatypes here (combined with
not having parameter names) is that you could make the mistake of
switching the arguments and so try to copy from destination to source,
which is generally a bad idea. This is more likely than you might
think, since in the standard method of copying,
memcpy()
, the destination address goes first.
Unnecessary casts clutter up the code, making things more difficult to read. Note that the destination is actually already the correct type, but gets a needless cast anyway. The reason for this is probably that it was copy-pasted blindly from another source, which is one of the bad habits I'm trying to warn against.
Naming inconsistency: REG_BGxCNT defines
BG_COLOR_256
, CHAR_SHIFT
,
SCREEN_SHIFT
and WRAPAROUND
are all part of
REG_BGxCNT
. However, apart from the first name, there is
nothing that indicates this. The reason this is bad is that it can
cause conflicts and/or confusion with other register bits. Bits like
BGx_ENABLE
, which aren't part of the background control
at all, but part of REG_DISPCNT
. Again, this can be solved
by a prefix.
Incorrect #define: WRAPAROUND
WRAPAROUND
is incorrect: it should be 0x2000
.
0x1
is a priority setting
(gbatek:bg control).
And like many other register #defines, it's impossible to tell
what WRAPAROUND
really refers to. Mapping systems
and text writers can have wrap-around as well, for example.
Improper types: Char/ScreenBaseBlock
In nearly – and perhaps all – cases, char and screen block addresses are used as pointers, not as raw addresses. It'd make sense to put types inside the macros, rather than forcing the user to add them explicitly.
Consistency: BGPaletteMem
The earlier PaletteMem
was a variable, now it's a #define.
It should be a #define, but it'd be nice if there was some
consistency in these things throughout the book.
Incorrect #define/info: REG_DISPSTAT
Well, the definition is correct, but not the comment above it.
0400:0004
is the display status, there is no such thing
as a vertical refresh register. Although there is a bit inside
REG_DISPSTAT
that checks the VBlank/Draw status, the
register has other functions as well.
Inconsistency/Incorrect function: WaitVBlank
This is the second version of WaitVBlank()
, and again it
does not do what it should. Nor does it do what it says. bit 0 of
REG_DISPSTAT
checks for VBlank status: if 1, then we're
inside the VBlank
(gbatek:dispstat).
So unlike what the function name indicates, it actually waits till the
VBlank is over, not until it starts. And as a timing mechanism,
it's still useless for reasons described earlier.
Inconsistency/Incorrect #define: BUTTONS
The proper name is REG_KEYINPUT
or REG_KEYS
or something else that starts with REG_
. More importantly,
REG_KEYINPUT
is a vu16
, not vu32
.
Improper/inefficient code:
REG_KEYINPUT
uses active-low settings (which should have
been mentioned here as well somewhere, as it's not exactly obvious).
This means that when a button is down, the bit is clear (0),
and not set (1). To go to an active-high setting, one should invert
the bits and then mask: `~BUTTONS & foo
'.
That says “check if button foo is down”. The
code, `!(BUTTONS & foo)
', checks whether foo
is not not down. Technically it's the same, but the formulation
is awkward.
Aside from that, a logical NOT and a bitwise NOT are not the same. In most cases the bitwise version will be quicker, though ultimately the difference is negligible.
6.2 RotMode2 (pp 218 - 224)
Improper types: REG_BG2PA-PD,-X, -Y
The affine registers are signed, not unsigned. Technically, it
doesn't really matter here because these things are write-only, but
people will base their code on this, use the unsigned types, and then
see the screen go wonky because a signed 0xFFFF
is not an
unsigned 0xFFFF
.
And yes, the proper term is ‘affine’, not Rotation or Rot/Scale. The registers form a general 2×2; affine transformation and while rotations are affine transformation, not every affine transformation is a rotation. The term is incorrect and misleading.
Unexplained terms: RotateBackground
Yes, the function has something to do with rotation and scaling, but nowhere is it actually explained what the terms are and what it actually does. When dealing with math, you must always define your terms clearly. This is especially true for the GBA case, where the affine transformations work a little differently than you may expect.
In any case, the function basically implements
tonc:eq 12.4,
with
α = −ang
,
map anchor p0 = (scroll_x
,
scroll_y
) and
screen anchor q0 = (cx
,
cy
).
The formulation and order of the terms in the equations is …
confused. When it comes to writing down matrix equations explicitly,
it is better to have the first column (the terms multiplied by
x_center
) before the second column (the
y_center
terms). Technically it doesn't matter, but it
makes it hard to visualise what's going on.
void RotateBackground(int ang, int cx, int cy, int zoom) { center_y = (cy * zoom) >> 8; center_x = (cx * zoom) >> 8; DX = (x_scroll - center_y * SIN[ang] - center_x * COS[ang]); DY = (y_scroll - center_y * COS[ang] + center_x * SIN[ang]); PA = (COS[ang] * zoom) >> 8; PB = (SIN[ang] * zoom) >> 8; PC = (-SIN[ang] * zoom) >> 8; PD = (COS[ang] * zoom) >> 8; }
Also, do not use all-caps for variables. These are reserved for constants and macros. One or two-lettered global variables is probably also not a good idea.
Inefficiency: sin/cos tables should be power-of-two
Power-of-Two-sized (PoT) arrays for periodic functions make things a lot easier, because you can then simply mask the index to keep from going out of bounds.
WaitVBlank() version 3
{
while(!(REG_DISPSTAT & 1));
...
while(REG_DISPSTAT & 1));
}
The necessary components for a correct vsynch are here, but not in the right order. The first part actually waits till the VBlank is over, not when it starts. As a result, all the updating is done inside the VDraw period, not after it. Switch the two statements and it should work correctly.
7 Chapter 7: Rounding up Sprites
Missing info: Converting tiles (pp 231)
gfx2gba can be used for object tiles as well. In fact, it's notably better than pcx2sprite. (Though grit is better then either here, of course :))
7.1 SimpleSprite (pp 236 - 241)
Terminology: SpriteMem, SpriteData and SpritePal
While it's good that they're clearly indicated as having something
to do with sprites, There is the potential for confusion between
SpriteMem
(i.e, OAM) and SpriteData
(Object VRAM), because Mem
and Data
are
pretty much synonyms.
Also, we have now three different nomenclatures
for palettes: paletteMem
, BGPaletteMem
and
SpritePal
.
Inconsistent/confused terminology: Obj attribute bits
Once again, it is important to indicate where bit defines belong,
which isn't done here. For example, COLOR_256
sets
the color-mode of an object to 256-color, but backgrounds have
a similar option, but at a different bit. Yes, it's called
BG_COLOR256
there, to avoid the naming collision,
but unless you already knew that, you might be tempted
to use COLOR_256
in both cases, or assume that there was
an OBJ_COLOR256
. The same goes for the
MODE_x
names, which might lead one to think that
they're similar to video mode bits and hence belong to
REG_DISPCNT
. Even better, the once that do have an
OBJ
prefix actually belong to REG_DISPCNT
,
not object attributes.
Improper types: x, y
Local variables should be int
s, unless you have a
really good reason to use something else. ARM cores are 32bit,
so use 32bit where you can. For a little more on this, see
tonc:
good/bad practices.
Inefficiency: UpdateSpriteMemory
{
int n;
unsigned short* temp;
temp = (unsigned short*)sprites;
for(n = 0; n < 128*4; n++)
SpriteMem[n] = temp[n];
}
This is a pretty slow implementation of an OAM updater. Using
u32
pointers instead of u16*
would double
the speed. Using dedicated copiers (like DMA) would increase the
speed by a factor of 5 to 10. Do not use manual copies unless you're
working on small ranges; use memcpy()
, DMA or
CpuFastSet()
.
7.2 BounceSprite header (pp 243 - 245)
Magic numbers: SpriteData3
//video modes 3-5, OAMData starts at 0x6010000 + 8192 unsigned short* SpriteData3 = SpriteData + 8192;
Magic numbers in code are generally a bad idea. When it comes to
tile addresses, magic numbers are very common for some reason. At the
very least, one should make a macro that takes care of the raw math,
so that you just have to enter the charblock and tile index to get the
right address. A nice trick would be to define TILE
and
CHARBLOCK
types so that you can map VRAM into tiles.
(see tonc:tileblocks).
This allows both easy addressing and copying.
But even if you do use magic numbers, it's better to use hexadecimal; 8192 just isn't a very nice number. In hex it's 0x2000, which is easier to remember.
Also, you have to be very careful with datatypes and because pointer addition, because the addition works by type, not by byte. In particular, SpriteData+0x2000 = 0x06014000, not 0x06012000, because the datatype of SpriteData is u16. The code is correct here, but the comment is not.
7.3 BounceSprite main.c (pp 246 - 251)
void MoveSprite(int num) { //clear the old x value sprites[num].attribute1 = sprites[num].attribute1 & 0xFE00; sprites[num].attribute1 = sprites[num].attribute1 | mysprites[num].x; //clear the old y value sprites[num].attribute0 = sprites[num].attribute0 & 0xFF00; sprites[num].attribute0 = sprites[num].attribute0 | mysprites[num].y; } ... //draw the background for(n=0; n < 38400; n++) videoBuffer[n] = bg_Bitmap[n]; //set the sprite palette for(n = 0; n < 256; n++) SpritePal[n] = ballPalette[n]; //load ball sprite for(n = 0; n < 512; n++) SpriteData3[n] = ballData[n];
Incorrect/bad code: MoveSprite
C has special, shorthand operators for things things like
`x = x op y
'. They're called compound operators,
and look like this: `x op= y
'. In the case of OR,
that would look like `x |= y
', rather than
`x = x | y
'. The shorthand versions are
generally preferred, as the statements are easier to parse and less
error-prone.
Additionally, the new sprite coordinates should be masked. If not, they can overwrite the rest of the bits when entering negative numbers (see tonc:obj-position).
7.4 TransSprite (pp 258 - )
Too specialized design methodology: InitSprites()
For general functions, make sure that what they do
makes sense and don't set bits that have no place in them. For example,
InitSprite()
forces objects to be transparent. This
should be dealt with via a parameter, not forced. It's probably better
for everyone to just have a number of parameters that mimic the
attribute settings. This limits the amount of function arguments and
shortens the code significantly.
Incorrect functionality: SetTrans() and SetColorMode()
{
mysprites[num].trans = trans;
sprites[num].attribute0 = mysprites[num].colormode |
mysprites[num].trans | mysprites[num].y;
}
void SetColorMode(int num, int colormode)
{
mysprites[num].colormode = colormode;
sprites[num].attribute0 = mysprites[num].colormode |
mysprites[num].trans | mysprites[num].y;
}
Only from the code could you tell these have something to do with
objects. Aside from that, what they do is not general enough. What
should be done here is mask out the relevant bits and then mask the
new bits in. As it is now, they will only update attr0
with position, transparency and color-mode settings, and erasing the
other bits, like the shape information. It just happens to work out
for square objects.
Magic numbers: REG_BLDMOD and REG_COLEV
It'd be nice to make some register #defines for these things. Not
everyone will know that (1<<4)
will mean object
top-layer transparency.
Inefficient struct design: SpriteHandler
While using int
s is usually good, one place where the
case isn't always so clear-cut is in structs. The thing about structs
is that the cost memory space, and IWRAM space at that in the case
of global variables. The SpriteHandler
struct takes up
8 words, and the array 128*8*4 = 4 kiB. That's already 1/8th of IWRAM.
Now, sometimes this is okay because, well, sometimes you just need
large amounts of data. But the members alive
,
colormode
and trans
are all single bits,
and size only requires 2 bits. So about half of the struct is always
empty. The solution is bit-packing, which is how most of GBA memory
works anyway. It should be done here as well.
Bad names: dirx/diry
These aren't directions; they're velocities. Directions don't have magnitudes; vectors like velocities do. Additionally, directions don't necessarily imply movement: positions have a direction too, as does looking.
7.5 Rotation and Scaling (pp 265 - 266)
Incorrect explanations
“Is there really any need to draw pre-rotated sprites anymore when support for rotating a sprite is built in to the GBA hardware?”
Yes, it is. There are only 32 affine matrices for objects, which may not be enough of all your sprites. Also, the affine transformations don't give very smooth results and have a number of artefacts that can look very ugly indeed (tonc:affine objects).
“The process isn't perfect because the GBA doesn't have a floating point processor …”
That's not the reason it's not perfect. The process is one of sampling; sampling can cause artefacts because that's just how sampling works, unless you do some funky anti-aliasing.
“… so all the rotation must be done in fixed-point math.”
Which is a non-obvious subject for most nowadays, and should be explained a little. Preferably when it's first used, which was in chapter 6.
“This is a refinement over the SIN and COS arrays you saw in the previous chapter, as there is no longer any need for a source file containing these radian values since they're just computed at the start of the program. This does cause a slight delay ”
Apart from the delay, everything in this sentence is wrong. What's described here is to precalculate the sin/cos arrays in-game, rather than precalculate them on the PC. This does cause a slight delay because the GBA is very bad at floating point math, which is necessary to build the arrays. But that's the least of your trouble. Because they're build in-game, the arrays will go into IWRAM as well instead of ROM. In this case, That's 2*360*4 = 2880 bytes of IWRAM wasted on what's essentially constant data. Furthermore, the amount of memory these arrays can be significantly reduced for the following reasons.
- You only need either a sine or cosine table, as they're basically the same.
- At .8 fixed-point, you'll only have 9 significant bits. The other 23 are essentially wasted.
To build them outside the game and link them in is the right thing to do. To build them in-game is not a refinement, it's a detriment.
Improper typing/naming: RotData
The types here should be signed (tonc:affine-types)!
7.6 RotateSprite
Wasted memory / missing member: SpriteHandler (pp 270)
There are two new members here: rotate
and
scale
. Contrary to what you might think, rotate
isn't the angle; it's the flag marking the sprite as an affine sprite.
The real angle is kept in a member called angle
, which is,
in fact, missing from the SpriteHandler
definition, so
the code wouldn't compile.
Button #defines (pp 271)
It would be much easier if these were in hex, or even binary to highlight the bitwsie nature of the constants.
#define BUTTON_B 0x0002
#define BUTTON_SELECT 0x0004
#define BUTTON_START 0x0008
#define BUTTON_RIGHT 0x0010
#define BUTTON_LEFT 0x0020
#define BUTTON_UP 0x0040
#define BUTTON_DOWN 0x0080
#define BUTTON_R 0x0100
#define BUTTON_L 0x0200
Fragile code: InitSprites() and ROTDATA() (pp 274)
The function uses ROTDATA(tileIndex)
to indicate the
affine matrix number. Affine indices max-out at 32; tile indices can
be up to 1024 and will be greater than 32 very often, making it a
bad idea to use tileIndex
like this. Also, the
ROTDATA()
macro does not mask-out its input, so that the
size-bits may be overwritten.
Inefficient/incorrect code: CheckButtons()/ Pressed() (pp 276)
It's a good practice to read REG_KEYINPUT
into a variable
and use that for interaction. This way, you can go to active-high
status, and do more complicated things like test for key-helds and
releases and the like
(tonc:keys-advanced).
However, the way it's done here makes little sense and only
to complicate matters. All you need is a simple routine that ANDs a
bit-mask. This will also have the benefit of being able to check for
multiple keypresses in one go, something which the switchblock does not
allow.
Also, for some reason some array-deferences ([1] - [4]) have gone missing. The code would not compile.
Incorrect comments: SetMode() (pp 277)
//set the video mode--mode 3, bg 2, with sprite support SetMode(2 | OBJ_ENABLE | OBJ_MAP_1D);
This actually sets the video-mode to 2 (which is a good thing because
the tile indices would be 512 or higher, making InitSprites()
fail) and no background.
7.7 Animated Sprites (pp 279 - 280)
Incorrect information frames and OAM
“The easiest way to animate sprites is to copy a particular frame on an animation sequence into OAM so that it is rendered during the next screen refresh.”
Not quite. The easiest and quickest way would be to have all the
frames in VRAM already, and update the tile index. This won't always
work because there may be too many frames, in which case you'd have
copy new frames into VRAM. Note, copy into VRAM, not OAM. OAM stands for
Object Attribute Memory, which is at 0700:0000
. The animation
frames go into 0601:0000
, which is part of VRAM.
7.8 AnimSprite main.c (pp 283 - 289)
Incorrect code: WaitVBlank() (pp 286)
{
while(!(*ScanlineCounter));
while( (*ScanlineCounter));
}
This is the fourth, incorrect version of WaitVBlank()
.
This actually waits until the start of the first scanline, rather than
the VBlank.
Improper types/magic numbers: UpdateBall()
{
u16 n;
//load ball sprite
for(n = 0; n < 512; n++)
SpriteData3[n] = ballData[(512*index)+n];
}
Do not use non-ints for loop variables. In this case, it slows down an
already slow loop by another 10-20%. And then there's
the matter of the magic 512 … what is it? Also, the
index
parameter has no datatype.
8 Chapter 8: Using Interrupts
Incorrect info: timer importance
Then I talk about the all-important subject of timers, how to slow down your program to a consistent framerate. This has been something of a problem in the prior chapters (aside from using the VBlank), but now you will have the means to correct it.”
The GBA timers are generally not used to keep a consistent frame rate. Synchronizing should use the VBlank, the problem in the prior chapters was that none of the projects did this correctly.
8.1 Using Interrupts (pp 296)
Incorrect info: software interrupts (pp 296)
“A software interrupt is common in a multitasking operating system like Windows 2000 or XP. Since the GBA is a video game machine, as you may have expected, all interrupts occur on the hardware side.”
This is not true. There are indeed software interrupts, also known as BIOS calls (gbatek:bios-functions).
Incorrect info: REG_DISPSTAT{VCOUNT} (pp 297)
The VCount trigger in REG_DISPSTAT
used bits 8-15, not
6-15.
Incorrect info: bit-ops (pp 298)
“The hexadecimal values in this list of definitions allow you to perform a bitwise AND with the REG_IE register in order to set the specific bit …”
This should be bitwise OR to set.
Not quite correct info: multiple interrupt timings (pp 298)
One and only one interrupt will occur at a time! So the REG_IF register will only have one bit set, not several. …”
This isn't completely true. Yes, only one interrupt will fire at a time (unless you get really unlucky), but multiple bits in REG_IF may be set. This can occur when dealing with nested interrupt routines, or if you get an interrupt but don't acknowledge it.
Incorrect info: acknowledging irqs (pp 299)
REG_IF |= INT_TIMER0;
To acknowledge an interrupt, you need to write that bit to REG_IF,
not use an OR-EQ. So that's `=
', not `|=
'.
Yes, this is a little odd, but that's how it really works. Using
`|=
' would in fact clear all the interrupts.
Code redundancy (pp 300)
if((REG_IF & INT_HBLANK) == INT_HBLANK)
When using a single-bit mask, you don't have to use the ==
and such. After all, it can be nothing else but that or 0, and these
will be enough.
8.2 InterruptTest (pp 301 - 305)
Improper typing of registers
IO registers should always be volatile. Especially interrupt registers. That means you, REG_IE, REG_IF and REG_IME.
Bad code for interrupts: MyHandler plotter (pp 305)
The object here was to draw a random pixel at every HBlank. As the
author indicated, “… hblank code must be fast!”. So
the last thing you should be doing is call rand()
3 times, call modulo 3 times, use a slow function, and do all these
things in ARM code from ROM. The whole routine will probably take
about 4-8 scanlines to execute.
8.3 Using Timers (pp 306 - 307)
Incorrect code: timer check (pp 306)
timer = REG_TM0D; if (timer % 65536) { // overflow--time to deal with it }
Timers run continuously so they will be a multiple of 65536 exactly is next to zero (well, 1/65536, really). Meaning that this code will execute pretty much all the time.
8.4 TimerTest (pp 307 - 314)
int timers; timers[0] = REG_TM0D / (65536 / 1000); timers = REG_TM1D; timers = REG_TM2D / (65536 / 1000); timers = REG_TM3D;
Improper types: timers
Again, array operators have gone somehow.
Problematic code: integer division
Because the 65536/1000 division is done first, you get a 1% inaccuracy
since this rounds down to 65. It'd have been better to use
`REG_TM0D*1000/65536
'.
8.5 The Framerate program
Incorrect info: VBA accuracy (pp 315)
“I think this program demonstrates that the VisualBoyAdvance emulator is working perfectly, because a consistent framerate of 60 FPS comes through when the VBlank is used.”
VBA is quite inaccurate in its timings. Keeping a consistent frame rate is the easy part for an emulator, because you just update REG_VCOUNT and those kinds of timers with the PC clock.
Incorrect code: framerate counter (pp 327)
frames++;
if (timer > 999)
{
...
//display frame rate
sprintf(str, "FPS %i", frames);
Print(1, 1, str, 0xFFFF);
frames = 0;
}
This code doesn't work as a framerate counter. The timer is set to 256 clocks, in other words about 1098 timer increments per retrace (308*228*4 / 256). The timer variable here has a maximum of 65536/65 = 1008. And on average will increment by 1098/65 ˜ 18. Because the check is done at 999, it can miss the overflow, using more frames in a 'second'. If you want a more accurate second timer, use cascaders, like here: (tonc:timer-demo).
9 Chapter 9: The Sound System
Don't know too much about sound programming, so I'll leave this chapter for later.
10 Chapter 10: Interfacing with the Buttons
10.1 Detecting Button Input
Incorrect info: REG_KEYINPUT size (pp 361)
“The button status value at 0x04000130 is a 32-bit number.”
No, it's not. REG_KEYINPUT
is 16-bit. The bits at
0400:0132
belong to REG_KEYCNT
.
“By the way, almost all locations are 32-bit because the GBA is a 32-bit machine.”
Yes, the GBA is a 32-bit machine, but that has little bearing on how memory is being used or the bus sizes. It's true that the addresses themselves are 32-bit numbers, but most of the time they're accessed as halfwords.
Improper variable/inconsistency: BUTTONS (pp 362)
Again, a variable where a direct #define would have been better. In earlier projects, it was indeed a #define. One other benefit of having it as a #define is that you can also deference the pointer in the macro, rather than having to do it yourself all the time.
One interesting bit here is that the impotance of volatile
is finally mentioned.
10.2 use of REG_KEYINPUT (pp 362-368)
This section covers an investigation of how REG_KEYINPUT
(aka BUTTONS
) works. While technically unnecessary as it's
clearly explained in the reference documents
(see gbatek:keyinput),
it is an interesting idea. However, when investigating things on a
low level, it is much handier to start looking at hex or binary values
rather than decimals. Bit-patterns are much easier to see in hex/binary
than in decimal numbers.
Specifically, it makes a loop like this unnecessary:
{
//check for button presses
for (n=1; n < 1000; n++)
{
if (!((*BUTTONS) & n))
{
sprintf(str, "BUTTON CODE = %i ", n);
Print(10, 40, str, BLUE);
break;
}
}
}
If one prints out the ~(*BUTTONS)
value in hex, it's
immediately obvious which button corresponds to which code.
{
// Print out code for pressed button(s)
sprintf(str, "BUTTON CODE = %04x", ~(*BUTTON));
Print(10, 40, str, BLUE);
}
10.3 Creating a button handler
Separating buttons into an array (pp 374)
Since button input is a very low-level aspect of programming the GBA, it's helpful to move the actual memory reading code into a function and store the results of the buttons in an array. … The main benefit of polling all the buttons at the same time is that you are more likely to lose a button if there is a lot of code between each poll.”
The motivation is good, the implementation isn't. It's already covered in an earlier chapter, but if you're only interested in the status (that is, a one-bit value), it makes sense to keep it in bits and pack it into one variable. This makes subsequent use so much easier than if they're separated into different variables, not to mention faster. As far as I can see, the only time a button-array is useful is if you need to know how long they're pressed or something like that.
10.4 The ButtonHandler Program (pp 375-)
Bad info: key releases
“As you have seen, the button handler need not be complicated unless you want to detect button releases separately from presses.”
Indeed, it doesn't have to be complicated, but the handler in the book actually is more complicated than necessary. All you need is two variables to store successive keystates in, and a few bitmasks (see tonc:keys). That's all that's required to check the state, and transitions like releases.
“… because when it comes to GBA coding, all that matters is responding to a button press.”
No. Key releases are very important for things like charging and combos.
11 Chapter 11: ARM7 Assembly Language Primer
Incorrect name of chapter
The basic idea behind a language primer should be to show how the language works. This chapter doesn't. There is next to no information about the instruction set, what it can and cannot do, or how to properly integrate it with C. What is does do is show how you can assemble a file, and show some very slow assembly routines. For a more useful introduction in the nuts and bolts of ARM assembly, see tonc:asm.
Bad practice: batch files (pp 396)
While batch files are indeed easier to work with than makefiles for small programs, for anything beyond the most simplistic stuff makefiles are far better. Giving them up for batchfiles is a step in the wrong direction. Using batchfiles like it's done here – as the place to collect the compiler flags – is a particularly bad idea because it means you still have to run several of them to complete a build.
Bad info: assembly difficulty (pp 400)
“Indeed, assembly is difficult to master, because each assembly instruction translates directly to a CPU instruction!”
The basics of ARM assembly is actually pretty easy. In many ways, easier than C because the format is simpler. There are no chances of operator precedence snafus or typing problems, because each operator is an instruction and there is only one type: the word. What is difficult is mastering it, and warping your mind into the right state to make proper use of it.
Bad code: linker batchfile (pp 403)
The linker batch file can only be used on one file, but Real Projects usually have multiple files.
As an example of how things should go, see the devkitPro or third-tier tonc template makefiles. If these are a little too advanced, use the second-tier tonc makefiles, which are still relatively easy to understand for makefile newbies.
11.1 FirstAsm Program (pp 406)
I can't be 100% sure, but I'm quite confident that this code of this
was simply taken directly from the gcc output. The presence of
decimal numbers for addresses is usually a dead giveaway. These
should have been converted to more reader-friendly hex-values.
The value 67108864
means nothing; but its hex
representation, 0x04000000
is, of course, recognizable
as the IO register base. The same is true for 33554432
.
According to the comments, it's supposed to be VRAM, but
33554432
is 0x02000000
, not
0x06000000
. So I guess something went wrong here.
Also note the complete lack of explanation of the code, and why it's
written the way it is. Why, for example, you can't just do
`mov r2, #1027
', but have to use a mov
and
an add
? Well, it so happens that ARM cores can't use
use constant values directly if they span more than a byte. It's
things like these that make assembly tricky to use, and what
explanatory texts should focus on.
11.2 DrawPixel32 (pp 409)
.ALIGN
.GLOBAL DrawPixel32
DrawPixel32:
stmfd sp!,{r4-r5} @ Save register r4 and r5 on stack
mov r4,#480 @ r4 = 480
mul r5,r4,r1 @ r5 = r4 * y
add r5,r5,r0,lsl #1 @ r5 = r5 + (x << 1)
add r4,r5,r3 @ r4 = r5 + videobuffer
strh r2,[r4] @ *(unsigned short *)r4 = color
ldmfd sp!,{r4-r5} @ Restore registers r4 and r5
bx lr
Incorrect name
The function has nothing to do with anything 32-anything. Usually, a 32 affix means there are 32-bit access somewhere, but this function doesn't have them. What's probably meant is that this was done with the ARM instruction set (32-bit) instead of Thumb (16-bit instructions), but that's mostly irrelevant from C's perspective.
Improper typing: videobuffer
The videobuffer parameter is an u32. Should have been a pointer. For the asm it doesn't matter, but it makes the use in C easier.
Inefficient code: DrawPixel32()
stmfd sp!,{r4-r5} @ Save register r4 and r5 on stack
mov r4,#480 @ r4 = 480
mul r5,r4,r1 @ r5 = r4 * y
add r5,r5,r0,lsl #1 @ r5 = r5 + (x << 1)
add r4,r5,r3 @ r4 = r5 + videobuffer
strh r2,[r4] @ *(unsigned short *)r4 = color
ldmfd sp!,{r4-r5} @ Restore registers r4 and r5
bx lr
IBut this is possibly the worst implementation for plotting mode 3 pixels you could find. I'm sorry, but that's just how it is.
The code
has 8 instructions; only 5 are necessary. You can do everything with
4 registers so that it's not necessary to use the stack. The
multiplication can also be replaced with something simpler, because
it can be done with shifted rsb
and add
's.
It should have read:
void DrawPixel32(u32 x, u32 y, u32 color, u16 *dst);
...
add r0, r3, r0, lsl #1 @ r0= dst+x
rsb r1, r1, r1, lsl #4 @ r1= y*15
add r0, r0, r1, lsl #4 @ r0= &dst[x + y*15*(32/2)]
strh r2, [r0] @ dst[x+y*240]= color
bx lr
In fact, that's what I get when I get from devkitArm's compiler now. The whole point of using assembly is to create code that'll run faster than a compiled equivalent. If you're making code that's actually slower, you're doing something very, very wrong. While it can be argued that optimised assembly is beyond the scope of a primer, in the case of assembly that's probably not the case as optimisation is the point of using assembly.
12 Conclusions
Well, that's about it I guess. The points mentioned here are the ones I could find on a first full read. While I admit that some of these points are more valid than others, I do think that each of them is worth thinking about. In the future I may add one or two points I've missed, or remove ones that are really just nitpicking.
Pingback: blog
Thanks for this. We used this book in a course for school, and I've been trying to explain why the book wasn't a great reference.
Pingback: Jessie
Pingback: covering letters format
It's easy to stick a flag on top of a building, making observations about all of the design flaws in the building while taking the elevator ride up to the top, but in the end, all you have done is affixed a flag on top.
First of all, this e-book was professionally edited and formatted, and then given away for free. I asked for a donation for the sources but have given away many copies of the sources for free to anyone who asked.
Secondly, the book was written to be easy to understand, not highly optimized. The material is very challenging, very low level, and maybe didn't succeed at being easy to understand. There are some glaring omissions, like lack of a decent sprite animation example with multiple sprites. After all these years, I've added a few more example games and demos, including a very nice sprite class and sprite handler, as addendum.
That being said... what's the point of this website? Did you pay for the book? Are you helping anyone by pointing out all the flaws in the free book? I disclaimed it in the introduction and first chapter that this book would teach the basics of the GBA. Even though it's not optimized code, I have gotten emails from guys who have used it to get a job working on handheld projects at game studios. I got an email from someone at Firaxis Games about it. There are teachers out there using this book and MY sources to teach courses without even asking, without even crediting MY hard work, let alone sending me a little email like "Hey, using your GBA stuff, it's been helpful, thanks." Nada. Nothing but takers...like YOU.
I have a suggestion: Write your own book. If it's so easy, as you seem to believe with this errata-or-whatever site, I'm sure you would have no problem writing a book yourself--one that will be 100% balanced on both sides of the equation.
You're just angry cos you know he's right.
Hey, pointing out flaws makes things better. Where would Open Source be without it?
Also, if you can't handle (a little) criticism, don't write anything on anything. I could see him ripping much deeper.
(And e-ghasp, he never attacked you personally, just your writing!)
"That being said... what's the point of this website?"
well, for starters... this WEBPAGE (not the site) was made to point out to absolute newbies that your book has some (serious) flaws in it, and that if they DO happen to learn off your book, to be wary of the flaws.
"I have gotten emails from guys who have used it to get a job working on handheld projects at game studios."
I believe that but... did they keep using YOUR coding practices? Doubt it, 'cause, like the page pointed out, the practices are... lame to say the least.
"Nada. Nothing but takers...like YOU."
Umm... you DO realize that's a personal insult yet he never insulted you, right?
"I have a suggestion: Write your own book. If [...]"
He pretty much did. Look at TONC, or are you THAT angry at him for writing up a great GBA tutorial?
Except that I'm not planting any flags; I'm trying to point out the flaws in the building. If in a building the floors are uneven, the support structures are rotten and the elevator cable is showing signs of fatigue, it's worth pointing those out to the tenants. In fact, it could be considered irresponsible not to.
[br]
I'm not attacking the book on its formatting, spelling, grammer or layout; I'm talking about its mistakes and bad programming practices. The fact that it's free does not make those things go away.
[br]
The book is easy to understand; that's not the problem. The problem is that much of the information is incorrect or leads to poor code. And I'm not asking for ‘highly’ optimized. I understand that optimization often comes at the cost of readability, but when it doesn't you might as well give the better version. This ultimately helps the readers, as they won't have to spend time finding a suitable replacement when the original function proves insufficient.
For instance, the simple addition of
inline
to the pixel plotters already speeds things up by roughly a factor of three. This kind of thing comes at no expense to the reader at all. (And yes, I know that this is indeed done in the demos written later. But it's still absent from the book.)Another example is the mode 3 text system. The font that's advertised as having a low memory footprint is actually sixteen times too large and takes up a substantial bit of IWRAM. A 1 bpp bitpacked font would have worked out much better. If bitwork is considered too difficult, the font could have at least been compressed to bytes rather than halfwords. This could have been done by changing the declaration from `
unsigned short font[]
' to `const unsigned byte font[]
'.[br]
Yes, I am.
I care deeply about correctness, efficiency and taking heed of recommended practices. I believe these things to be beneficial to the quality of whatever one is working on. This is especially true for GBA programming, where resources are sparse. It's part of why I like it.
The point is, sloppiness is bad, and it deserves to be pointed out. this way, people with less experience can both avoid the bad practice, and perhaps recognize it when they encounter it somewhere else. I really do not see how this can be considered a bad thing.
[br]
When it comes to learning something new, most people will simply copy what their sources tell them. In most cases this works out fine, but if the information has flaws, those flaws will be copied as being factual and possibly spread.
The GBA homebrew scene is a good example of this. In the beginning, there was PERN. This was very easy guide to the basic elements of programming for the GBA, but it did contain some factual errors and quite some bad programming practices and style. Nearly all GBA programming guides that followed inherited some or all of its problems. This includes your book, which fortunately has done away
with some of the problems, but unfortunately has added a few new ones as well.
People have been basing their code on these guides and also adopted their bad practices. Dozens, perhaps hundreds of threads on the gbadev forum are related to these issues. Many people, me included, have been offering solutions to these problems; problems that would have never come up if people (especially the writers) had simply followed standard programming practices in the first place.
Part of the reason I wrote Tonc was to warn people about the problems found in other tutorials, how to do things properly and why. This page is simply an extension of that. I could have written one for some of the other guides as well, but since the others don't have the presence that your book has (with good reason: it is superior in layout, presentation and coverage), I find it the only source that's worth doing it for.
[br]
I know I sometimes sound … irritated in the text. This is because some of the mistakes simply confound me. For example, there are several different methods of vertical synching in the book and they're all wrong! That a well-willing amateur makes such a mistake is bad enough, but to see a professional writer and computer science teacher do this just boggles my mind.
The only reason I can think of for this problem is that they were simply copied blindly from other tutorials without checking if they worked. You can't expect a student to catch something like this when the teacher has missed it repeatedly. This just illustrates the need for a document like this one.
[br]
Yes, I know. That's part of the reason this page exists: to try to prevent erroneous information from being passed on to the students. For the record, just as there are people for whom it's been helpful, there are other who would like it to go away. See here, for example. I am not alone in this.
[br]
Please point out anything that I've taken from the book other than for the purpose of this document.
[br]
In a manner of speaking, I have.
I know very well how hard it can be, and how it feels when someone speaks badly about it. But just because it is hard work doesn't mean it can't be criticized. In fact, when it comes to a textbook, I think that flaws should be pointed out and if possible fixed. Students should be able to trust a textbook to give them accurate information and correct procedures. Feel free to disagree, but I believe it is the responsibility of an author to check the book's contents and to fix or at least point out any problems that are were missed at its release. If he does not, then somebody else might do it for him.
To Jonathan:
Instead of being so defensive, why don't you just fix the issues in your book? You are a professional programming author, so surely you care about correctness? In this case, the reason the GBA programming community recommends against your book, and highly recommends Cearn's TONC is because his is correct, and yours is not. We all make mistakes, and graciously accepting feedback and correction shows that you care about the quality of your work, and improves both your own knowledge, and allows the entire community to benefit. Cearn has corrected me multiple times, and I have learned and benefited from it.
Fighting back in the manner you have done accomplishes nothing for yourself or the community -- other than turning people away from the rest of your books. Telling Cearn that he is "Nothing but a taker" is but a sophomoric jab, as he has spent a lot of time and energy providing the GBA community with the best tutorial that exists.
So if you'd like us to take your book (or any of your other published books) seriously, put down your ego, learn from others, and everybody wins.
Ok,
So I'm glad I came across this and will read over it fully someday I imagine, that said I can understand some of John's annoyances, I mean first off I learnt programming the GBA fully off this book as much as I probably ever will and maybe his code was inefficient but it's introduced me to programming at a low level very well and I understand the principles which is all that was really asked for us on my course. I am no English boff but I believe professional writing even annotation's shouldn't reflect the author's emotions towards the subject, also perhaps you should put more thought into why he's done some of the things he's done.
i.e.: Real C programmers know how to use compound assignment operators like &= and |=. Not using them makes the code longer, which is generally more difficult to parse, which makes it easier to make errors.
Ok, what you say is true, no questions there but I remember a time when I wouldn't have understood compound assignment operators easily, perhaps John was aiming to make it GBA Programming for Dummies in this case. Assuming that he just didn't know of the possibility is rather short sighted on your part, like I say I haven't read the document in full and I may be jumping to conclusions that the rest of it is like this.
What both you AND John need to remember was his book was a first edition, some mistakes are expected, especially in programming books where the code is probably not being edited sufficiently, still John acknowledged it was a First Edition and has openly stated he doesn't intend to work on a full 2nd edition preferring to add bit's he never had the opportunity to write, so he can't get upset if someone wants to correct the mistakes he's made.
I hope I haven't offended anyone and I intend to email this site to my course tutor with the suggestion he includes it for next year.
Regards,
Brian
I had forgotten about this web site until another college professor mentioned it again. (Sigh). I acknowledge all of the points made here by visitors since my last post a couple years ago. What almost no one has bothered to notice, however, is that my ill-fated GBA book was written in early 2003, while this web site is dated 2007-08. My irritation in the first post was based on this web site author's presumption to correct me on scores of issues FIVE YEARS after the fact. Fine, so he knows a lot more about the GBA than I do, FIVE YEARS LATER. I used the material to teach a course for about three years myself, and came up with a whole ton of new code, a graphics library, scrolling examples, a fully-working heap memory manager, image tools, and C++ sprite classes to manage animation. None of this new code was squeezed back into the old e-book since I didn't have the time or desire to write any more on this topic.
When I started writing, back in late 2002 and into spring 2003, there was next to nothing available online about the GBA, besides a lot of very difficult hardware guides and a few very confusing examples on the fan sites, filled with extremely low-level hardware code, almost no middle or high level function libraries whatsoever... literally, every example I came upon was filled with hard-coded hardware registers and bit-shifting. Taking those electrical engineering-level examples and trying to make legible examples out of them, transcribing many of the registers into #defines, and offering functions, was not without problems. (All eloquently pointed out here on this web site with much sarcasm). I was working with André Lamothe and Joe Grand, who both seemed to find my code interesting. A quick summary of the GBA can be found in chapter 5 of Grand's book "Game Console Hacking".
I was offended that this web site author had come along many years later to criticize something when he clearly was already a quite advanced GBA programmer who did not need such a book. But I don't mind nearly as much as my original post conveys. So many years have passed that its hardly worth discussing now... surely everyone has moved on to the NDS or PSP.
For the record, my first reaction was a bit strong, but it's no big deal. I'm glad people have found this web site and learned more about the GBA from it. If my very early work with the GBA helped anyone, and led to some excellent tutorials with only good programming practices in them, then I'm glad and it was worth the effort.
Have you ever considered creating an ebook or guest authoring on other blogs?
I have a blog centered on the same topics you discuss and would love to have you share some stories/information.
I know my visitors would enjoy your work. If you're even remotely interested, feel free to shoot me an e-mail.
"My irritation in the first post was based on this web site author's presumption to correct me on scores of issues FIVE YEARS after the fact." - jsharbour
in fact it does not matter how old a book is. it does matter wether the information in it is wrong no matter how old it is. its more important to have a valid resource with trustable information. and since the book was re-released in 2012 the post might still be of interest for some buyers. in fact its the task of a serious author to provide errata. other authors do that as well.
"So many years have passed that its hardly worth discussing now... surely everyone has moved on to the NDS or PSP." - jsharbour
it is still funny to read these comments. in fact the book was re-released as a kindle ebook in 2012. if the advices have not been integrated into the new book that would be a pretty bad sign for a college professor who teaches this subject. if so the post is up to date again and not "FIVE YEARS after the fact" anymore. :)
coronac: good work btw, well done.
before i forget it:
"I was working with Andr� Lamothe and Joe Grand, who both seemed to find my code interesting."
Lamothe is NOT an indicator for good books, more or less the opposite. I earn enough Lamothe books to say that. Just read the customer reviews on amazon. Everybody knows that 50 percent of the content in his books is bad c source code. Good authors would put source on a cd and write about algorithms. He uses source to fill books. I still wonder why people think he is a game programming guru just because he is writing the foreword of most of the badly written books on game programming by amateur authors!?
Wow, this has been a very interesting time-machine-experience for me. I'm about to set up a project to finally learn how to make GBA games and I find it very exciting! I would like to thank both Jasper Vijn and Jonathan Harbour for having spent a lot of your time and effort into teaching the rest of the world about GBA development. I want both of you to realise how important you have been for the establishment of the GBA development documentation and you should both be very proud of yourselves. THANK YOU SO MUCH!
Time to start making games, yeah! I owe it all to you guys! :)
Would be nice to be able to repay you guys in some way at some point, maybe when my game is done you can have a copy of it. Sharing is caring, happy days! :D
I'm a 15 year old aspiring game Dev and J.S. harbour's book definitely made me have some bad practices in GBA development. Thankfully I found Tonc by the chapter where Harbour had us doing the button tests lol. I'm just happy I found this early on or I would have been subjected to even worse programming practice and false/incorrect information regarding GBA development xD however, I do thank you both (Coranac and Harbour) for releasing these wonderful materials and allowing me to learn from them and their mistakes (emphasis on mistakes for Harbour haha xD)
This page came up again after so many years. I don't remember writing those two earlier posts, as it was so long ago--wish I hadn't, but back then I had a hard time with hate mail--the same sort you see on Youtube. Over a free e-book--can you believe some people? Anyway, this was about 8-12 years ago when it was relevant. Wow! This was a lot of fun back in the day, though.
I wanted to reply to aeryck and Kaleb since I didn't see your posts. Glad you enjoyed it too! Fun times.
There were about a dozen stories at the time about guys who were hired after putting together a GBA or Dolphin SDK demo for their portfolios--EA, 2K, Vicarious, when I was teaching at UAT. Just having a head start with the GBA architecture was enough, not necessarily experience with the legit tools. During the peak, 2008-ish, I was working with a colleague from LucasArts on a DS version of Memoir 44 (Days of Wonder). What a crazy idea! The DS was a lot tougher than the GBA, so all we really had available was a video buffer, no hardware. The SDK was similar to the Dolphin, nothing like the GBA.
p.s. He (DW) is with an indie studio now called Impeller, and they've got a sci-fi game on KS called STARFIGHTER INC. Check out the team, some cool guys behind this game, some good industry experience. (expires 6/06/15). https://www.kickstarter.com/projects/impellerstudios/starfighter-inc/
jsharbour brings shame on his entire family
Hello admin, i've been reading your website for some time
and I really like coming back here. I can see that you probably don't make money on your
website. I know one awesome method of earning money, I think you will like
it. Search google for: dracko's tricks
á”OW just what I wwas searching for. Came heгe bby searching for alu foldeknive
Howdy
SEO Link building is a process that requires a lot of time fo coranac.com
If you aren't using SEO software then you will know the amount of work load involved in creating accounts, confirming emails and submitting your contents to thousands of websites in proper time and completely automated.
With THIS SOFTWARE the link submission process will be the easiest task and completely automated, you will be able to build unlimited number of links and increase traffic to your websites which will lead to a higher number of customers and much more sales for you.
With the best user interface ever, you just need to have simple software knowledge and you will easily be able to make your own SEO link building campaigns.
The best SEO software you will ever own, and we can confidently say that there is no other software on the market that can compete with such intelligent and fully automatic features.
The friendly user interface, smart tools and the simplicity of the tasks are making THIS SOFTWARE the best tool on the market.
IF YOU'RE INTERESTED, CONTACT ME ==> seosubmitter@mail.com
Regards,
Estelle