The Annotated 'Programming the Nintendo GameBoy Advance: The Unofficial Guide'
- 1 Chapter 1: The Zen of Getting Started
- 2 Chapter 2: GameBoy Architecture in a Nutshell
- 3 Chapter3 : Game Boy Development Tools
- 4 Chapter 4 : Starting With the Basics
- 5 Being One with the Pixel (pp 111)
- 6 Chapter 6: Tile-based Video Modes
- 7 Chapter 7: Rounding up Sprites
- 8 Chapter 8: Using Interrupts
- 9 Chapter 9: The Sound System
- 10 Chapter 10: Interfacing with the Buttons
- 11 Chapter 11: ARM7 Assembly Language Primer
- 12 Conclusions
Introduction
“Programming the Nintendo GameBoy Advance: The Unofficial Guide” by J. Harbour is an e-book on GBA programming covering some of the basic and not-so basic items involved in GBA programming. It's a clear, easy to understand and professional-looking book; one that is used by many people taking their first steps into GBA and if I recall correctly several colleges.
That said, there are also a number of bad aspects of it, namely its content. Or at least, the quality of large parts of its content. Yes, it does teach the basics of GBA programming, but it also:
- Focuses on the wrong basics (bitmap modes, which are hardly ever used in commercial games).
- Uses inconsistent nomenclature in code and other bad programming practices.
- Contains code that is simply incorrect (does not compile or worse).
- Gives incorrect advice and information.
- Uses very, very inefficient code without significantly increasing legibility.
All these items hinder long-term understanding of GBA programming. That incorrect code and information will cause problems is obvious, but things like inconsistent code and other bad programming methods are potentially even worse, because the badness is hidden there.
For example, consider code that is inefficient or is just a mess. Yes, it might work, but it could work a lot faster if you do things a little differently (and this difference can be as small as changing a datatype) and it can be hell to maintain later on. Another example is inconsistent terminology. Things should follow the Principle of least astonishment, which you can achieve by consistent design and nomenclature. The problem with inconsistencies is that you never know where you stand; it's a sign of sloppy thinking. Inconsistencies make it hard for novices to see the rationales (if there are any to begin with) behind things and the only result this can have is a flawed understanding of the subject. Code will become ‘magic’: an incantation to utter to get something done. Code is not magic and you have to actually think about things before acting. This is part of the difference between a good programmer and a bad programmer.
Because of these flaws, readers of the book are likely to get stuck at some point and will have to unlearn and relearn how to do things properly before they can really continue. Yes, I know how that sounds, but we've seen many, many instances of that on the forum over the last few years. Tonc has been one of my efforts to weed them out; this is another.
Most of the annotations here aren't actually about GBA programming specifically, but general principles of programming like consistent terminology, efficiency, and basic programming principles. Nearly every demo in the book suffers from these problems. Because later demos often repeat the same problems and I don't want to repeat myself too often, I'll only covers a specific problem once.
I think I've been fairly thorough with this text. In some cases, perhaps a little too thorough. Some of the points mentioned here will be Really Important, while others are nearly too trivial to mention. I mention the latter anyway because I do consider them to be worth thinking about at least once. This is still somewhat of a first draft, but it's a pretty complete draft, and should be a useful reference as it is now. Barring a few layout issues and typos, I don't think the text will need to be altered that much.
Now, writing something like this should have been the author's job. When someone else does it, especially ‘the competition’, it might come off as a little mean-spirited. Well tough. If someone leads people into blind alleys, putting up signs indicating them as such is a good thing, not a bad thing.
1 Chapter 1: The Zen of Getting Started
No annotations (there's little to annotate about introductions anyway)
2 Chapter 2: GameBoy Architecture in a Nutshell
Incorrect info: Memory architecture (pp 54-59)
The term ‘Access Width’ that's used here is a little ambiguous. For example, table 2.2 lists the access widths of the memory sections as 8 or 16 bits, except for IWRAM, where it says 32 bits as well. The problem here is that there is a difference between bus width and (software) access width. Table 1 lists the information according to GBATek. The bus-size may be 16bit in some cases, but this does not mean the access is restricted to that size; it would just take a little longer.
For example, it simply isn't true that VRAM accesses (pp 58) are restricted to halfwords: words work as well, only it'd take a cycle longer. However, when filling regions of memory you have to loop over a given size anyway, and then word-accesses will actually be quicker because there are only half the loop iterations in that case. If you do it right, you can get a performance gain of up to 8 times.
| Region | Bus | Read | Write | Cycle |
|---|---|---|---|---|
| BIOS ROM | 32 | 8/16/32 | - | 1/1/1 |
| Work RAM 32K | 32 | 8/16/32 | 8/16/32 | 1/1/1 |
| I/O | 32 | 8/16/32 | 8/16/32 | 1/1/1 |
| OAM | 32 | 8/16/32 | 16/32 | 1/1/1 |
| Work RAM 256K | 16 | 8/16/32 | 8/16/32 | 3/3/6 |
| Palette RAM | 16 | 8/16/32 | 16/32 | 1/1/2 |
| VRAM | 16 | 8/16/32 | 16/32 | 1/1/2 |
| GamePak ROM | 16 | 8/16/32 | - | 5/5/8 |
| GamePak Flash | 16 | 8/16/32 | 16/32 | 5/5/8 |
| GamePak SRAM | 8 | 8 | 8 | 5 |
Incorrect info: ARM vs Thumb code (pp 60).
There is a performance benefit to using 16-bit instructions in a computer with 16-bit memory, although I don't get into Thumb code at all in this book, as it is an advanced topic. True, there may be cases for using Thumb instructions to speed up parts of a game, but there are also cases for using regular ARM instructions as well.
Up to the first comma the quote is correct; after that it gets murky. Yes, Thumb instructions are better suited for 16-bit memories, which is why Thumb instructions are advised for GBA programming. There is no “… may be cases …”: the brunt of the code should be Thumb leaving ARM instructions for optimized code, not the other way around. It's not difficult to do so either: use the -mthumb compiler flag, together with -mthumb-interwork.
3 Chapter3 : Game Boy Development Tools
HAM stuff. Can't really comment on that.
4 Chapter 4 : Starting With the Basics
4.1 REG_DISPCNT (pp 105-106)
Nomenclature / classification.
It is better that #defines for register bits use a prefix of some sort that indicates where they belong. Names such as MODE_3 and BG2_ENABLE don't do that, which can become problematic when other things have modes and backgrounds as well.
Volatile registers.
All IO registers should be defined as volatile. Because this would make
the definitions long (volatile unsigned long, etc), use the common
shorthands like u32, vu32, etc.
4.2 Fillscreen demo: inefficient code / bad coding (pp 106-108)
//# From the fill screen demo. ... //create a pointer to the video buffer unsigned short* videoBuffer = (unsigned short*)0x6000000; ... //macro to pack an RGB color into 16 bits #define RGB(r,g,b) (unsigned short)(r + (g << 5) + (b << 10)) ... void DrawPixel3(int x, int y, unsigned short c) { videoBuffer[y * 240 + x] = c; } ... //fill the screen for (x = 0; x < 239; x++) { for (y = 0; y < 159; y++) { DrawPixel3(x, y, RGB(0,(255-y),x)); } }
The following quote has four excepts from the FillScreen project that are problematic.
Variable for videoBuffer
Creating pointer variables for memory accesses is usually a bad way of doing things. First, it takes away IWRAM space. Not much, I'll admit, but if you do this for every section/register, you're already up to half a kB at least. More importantly, though, is that it's dangerous. The pointer can be redirected at any time, which can be be bad because the addresses are static. This can also be a cause of slow-down, because global variables need to be reloaded after every function because that function may have modified the variable.
The general method for hardware access is a #define. In this case:
#define videoBuffer ((u16*)0x06000000)
Note the double parentheses here: one for the cast and one for the
whole thing. The second one is necessary because array accesses have a
higher operator priority than casts: (u16*)0x06000000[1]
would be parsed as (u16*)(0x06000000[1]), which won't
even be compiled. This is probably why the variable is used in the
first place.
That said, in the case of videoBuffer, using a variable might not be such a bad idea because it should be redirected for page flipping. But that's for modes 4 and 5, not mode 3.
RGB Macro
Preprocessor macros are generally considered evil, because there are
so many potential pitfalls there. One of them is that parameters can
be expanded in funny ways because the work by direct text substitution.
For example `#define SQUARE(x) x*x' with x = 1+2
will give 1+2*1+2 = 5, and not 9. This is why every book that
covers C gives the advice to always, always put parentheses
around the macro arguments. Because it's not done here, the call to
RGB() later on requires the parentheses –
something which one can't expect the users to know – and
they'd have a right to complain when their code goes wonky.
This kind of code is one of the reasons why most books will urge you to use inline functions instead of macros. Use macros only when you really have to.
The macro is also somewhat unsafe because it doesn't validate the
arguments. The color components have ranges between 0 and 31;
go out of the range and one color will bleed into another. But
people are wise enough not to do that, right? Usually, yes, but
in this particular program this is exactly what happens:
the color used is RGB(0, 255-y, x) meaning that
the green and blue component are generally outside the safe
range here. Yes, it doesn't matter much here because a funky
pattern is kinda what the author is after, but it'd be nice if
unsafe uses like this had been mentioned.
Function for drawing pixels
There are three kinds of procedures in C: macros, functions and inline
functions. Each of these has pros and cons. The important con of
regular functions is the overhead cost for calling them:
setting up the parameters, 2 jumps between the caller and called
function, and potentially a few others as well. In contrast, macros
and inline functions are integrated into the caller so there's no
overhead. Additionally, because functions are separate entities you
(or the compiler) can't optimize their calls much. Like, say, loading
videoBuffer once and using it with offsets if you're in
a loop. No, you have to reload it each time.
For most functions, the overhead of a function will be small compared
to its body. However, the body of DrawPixel3() is one
line. In most cases, the overhead here will cost you a factor 3!
Pixel plotters should be macros or inlines. Always.
Screen fill loop
In C, the x-coordinates of a matrix (such as a bitmap) are
contingent in memory. Because of that, it is better to put the
x-loop inside the y-loop. That way, the code will be
optimized for using consecutive memory. Of course, it doesn't matter here
because DrawPixel3() doesn't allow optimizing, but when
using a proper mode 3 plotter, the effects should be noticeable. Also,
the boundaries are usually the screen dimensions, not one less.
This gives use the following improvements:
#define videoBuffer ((unsigned short*)0x6000000)
// Macro to pack an RGB color into 16 bits
#define RGB(r, g, b) (unsigned short)((r) + ((g) << 5) + ((b) << 10))
static inline void DrawPixel3(int x, int y, unsigned short c)
{
videoBuffer[y * 240 + x] = c;
}
...
// Fill the screen
for (y = 0; y < 160; y++)
{
for (x = 0; x < 240; x++)
{
DrawPixel3(x, y, RGB(0, (255-y), x)); // NOTE: still unsafe, but meh.
}
}
None of these items are particularly difficult to do correctly and none of them complicate the code. Both versions will work fine, but the original code used bad programming practices and inefficient code. In trivial projects this doesn't matter, but once the projects get large enough they will become problematic and you might have to start all over again to get rid of the bad practices.
This is why people object to this book and many of the other tutorials: they're riddled with things like these. Newbie programmers will come into them and adopt their poor programming standards and only later find out that the practices the use are in fact bad. But by then it may already be too late to do anything about them :\.
4.3 ButtonTest project (pp 111)
Bad coding: copy/pasta (pp 111).
ham_DrawText(3, y, buttonstring);
and
if (F_CTRLINPUT_button_PRESSED) ham_DrawText(0, y, "X"); else ham_DrawText(0, y, " ");
This is probably a matter of taste, but if you have a list of nearly identical statements or code blocks it's probably better to use loops and look-up tables. It's usually easier to read and maintain. Copy&Paste code is bad, m'kay?
5 Being One with the Pixel (pp 111)
5.1 Mode3Pixels (pp 133-135)
//# Parts from Mode3Pixels's main.c //declare the function prototype void DrawPixel3(int, int, unsigned short); ... //changes the video mode #define SetMode(mode) REG_DISPCNT = (mode) ... //packs three values into a 15-bit color #define RGB(r,g,b) ((r)+(g<<5)+(b<<10)) ... x = rand() % 240;
Prototype formulation
A function prototype is a way to tell the compiler that a function of a certain name exists, what the number and types its arguments are, and what type it returns. The prototype given here does just that. However, it should also serve as a summary for the programmer. Because the declaration here doesn't mention the names of the arguments, it's much less useful than it could be.
Consistency: RGB() macro
Note that the r term now has parentheses. However,
the others do not, so it's still unsafe to use.
Inefficient ranged randoms
While the use of `rand() % N' for a random number between
0 and N is a time-honoured tradition, it doesn't really work well for the
GBA because both rand() and the modulo use division,
which are slow operations on the GBA. A better solution would be to
use `rand()*N>>15', which may not be as obvious,
but it works just as well (better actually, because the higher bits
tend to be more random) and is much, much faster. It makes use of the
fact that rand() returns numbers in the range
[0, 215〉, which can be interpreted as a Q15
fixed-point number between 0 and 1. For even faster randoms, replace
rand() with your own liner congruent RNG that doesn't use
divisions. You can go from 1200 cycles to about 90 that way.
(tonc:qran)
5.2 Mode3Lines, Mode3Circles, Mode3Boxes (pp 147-149)
As I said before, using DrawPixel3() as a function
instead of a macro or inline means losing a factor 3 in speed.
It gets even worse for DrawBox3(). The 3x is the standard
function overhead, but in the case of drawing rectangles, there's
potential for optimization because you're accessing adjacent pixels.
This means the speed loss is more like 3.5 instead of 3. And then
there are simple optimisations that aren't taken here. Filling in
words instead of halfwords would double the speed. Using a fast filler
(DMA/memset16) would increase it even more. Much more.
5.3
Mode3Bitmap (pp 152-154)
Bad coding: #including data
If there's one bad practice the experienced GBA homebrewers would like to see go away, it's this.
#include "mode3.raw.c"
The reasons why this is bad can be found at tonc:header. The author mentions this is technically a bad practice and that the proper method is separate compiling&linking, but that it's alright for small projects (at least I think he does; I just can't find the reference anymore). This is true, but the fact is that the many of the homebrewers and student don't know the proper procedure yet, and will adopt this as standard, which will then have to be unlearned later.
5.4 Palette discussion
mem variable: PaletteMem (pp 156)
Should be a #define.
Bad info: transparent color
The first index of the palette is not usually set to 0 (black) but to a bright, easily distinguishable color, so that it's easy to see what should be transparent and what shouldn't. There's a reason that movies and TV use blue-screen or green-screen for CGI, and not black-screen.
Incorrect info: sprite palettes (pp 156)
‘Remember, even the sprites in your game uses this palette …”
No, sprites have their own palette.
Incorrect info: DMA (pp 158)
“However, mode 4 has an advantage of being fast when using hardware-accelerated blitting functions and DMA, because twice as many pixels can be copied to video memory.”
Technically this is true, but it's misleading. The other modes can use DMA just as well. It's not really that twice as many pixels can be copied, but that mode 4 simply uses half the number of bytes. But even with DMA and 2× fewer bytes to copy, mode 4 is simply too slow for general games; a fact not mentioned anywhere.
5.5 Mode4Pixels (pp 159-161)
Code inconsistency : RGB() macro
Now all terms are properly parenthesized. If it's corrected here, why's it not done in other projects?
mem variable: paletteMem
paletteMem should be `#define paletteMem
((u16*)0x05000000). Also note the inconsistent naming for
memory: paletteMem and videoBuffer. We'll
see more of this later.
Bad coding: loop variable names
Loop variables are named i, j, k
and such (or doubled: ii, jj, kk).
n and m are used for sizes, not indices. Though
I'm willing to admit I prefer it that way my background is physics
and not of computer science.
Inconsistent use of functions
Why are we using memcpy() for the bitmap, but not the palette?
memcpy usage (pp 164)
“The memcpy function copies a specified number of bytes from a source buffer to a destination buffer. It works great for displaying a bitmap in mode 4!”
Yes it does. It would also work great for the other modes.
“Memcpy may not be the fastest method, because it doesn't take advantage of writing two pixels at a time …”
It's not the fastest, true, but it does write 2 pixels at a time. Four even. In fact, that's the only reason it works: if it did copy byte for byte it wouldn't work because of the no-byte-write rule for VRAM. However, it will only work when the size and alignment are right. (tonc:memcpy)
“However, it may be faster to copy mode4_Bitmap to the video buffer using a loop that iterates 160x120 times …”
No it won't, because memcpy() already takes the faster route:
struct-copy by 16 bytes in one go.
“… (that's half the number of pixels)”
Yes and no. Yes, it's half the number pixels, but that's not what counts. The real point of using 160x120 would be because that's the number of halfwords you need to copy; that this amounts to half the number of Mode 4 pixels is besides the point: it's the number of chunks that matters. Even better would be to copy in words of multi-word structs.
Once again: the number of pixels is irrelevant, it's the amount of data that's important. ‘Pixels’ are just an interpretation of data. For more in data, see (tonc:data)
“…but I'm not going into optimization at this point because DMA is faster than any software loop.”
But not by much. In copying, it's only 10% faster than a well-written copier. For filling, it's actually 10% slower!
5.6 Page flipping (pp 164-165)
Incorrect information: double buffering speed
“Basically, when you draw everything to an off-screen buffer, things move along much more quickly. In fact, using a double buffer makes the drawing operations so fast that interferes with the vertical refresh, so you much add code to check for the vertical blank period and only flip in this period.”
No. The speed of drawing is the same, you just don't see it happen. Page flipping is the solution to interference with the VBlank, not the cause. If you don't wait for the VBlank, tearing etc will happen, only with page flipping you can prepare a scene off-screen (even in the VDraw period) and only display it when it's ready. It is true that waiting till the VBlank for the flip is a good idea, though.
Mem variables/inconsistent nomenclature: FrontBuffer/BackBuffer
Again, use #defines for FrontBuffer and
BackBuffer. And more inconsistent naming for memory;
this time it's initial caps instead of camelcase. The
BACKBUFFER #define is problematic as well, as it refers
to the bit in REG_DISPCNT instead of the actual backbuffer.
5.7 Mode4Flip (pp 165-170)
Oh my, where to start.
volatile unsigned short* ScanlineCounter = (volatile unsigned short*)0x4000006; ... paletteMem = RGB(0, 31, 0); ... //slow it down n = 500000; while(n--); ... void WaitVBlank(void) { while(*ScanlineCounter < 160); }
ScanlineCounter and WaitVBlank
First, as explained a number of times now, use #defines for memory
mappings: it's generally faster and less memory intensive – no sense
in wasting memory if there's absolutely no reason for it. Second,
0400:0006 has a perfectly valid name already:
REG_VCOUNT. Third, notice the volatile here.
It's definitely necessary because the register changes without the
programmer's interference. However, there's no mention of this crucial
bit of information anywhere in the text.
As for WaitVBlank; technically, it's correct: it waits till the VBlank.
However, what's needed for proper animation is waiting for the next
VBlank. If we're already in the VBlank, we pass through this function
immediately. That is why there's that wait-loop inside the main loop,
to wait till we get out of the VBlank again. The correct procedure
would be to check REG_VCOUNT
(tonc:vsync).
Also, note the dereference of ScanlineCounter. This is required because
it's a variable; with a #define, you could build the deref right into it.
Incorrect code: paletteMem redirection
And this is one of the reasons why memory mapping should be done with
#defines instead of pointer variables: pointers can be redirected,
like it's done here. What it should have said here is
`paletteMem[1]', but as it stands, it makes
paletteMem point to address 0000:03E0. If it
was a #define, the compiler would have complained.
Bad coding: Slow-down loop
The only reason this is ‘necessary’ is because
WaitVBlank() is flawed. In fact, with a proper compiler
this wouldn't even work because the optimizer should recognize it as a
time-wasting empty loop and throw it away.
Inefficient code: DrawBox4()
I admit that the function is comprehensible, but it is so ridiculously
slow that it should not be used in an actual project. Ever. Even with
a moderate amount of effort, you can get a function that's more than
30× faster. Remember: DrawPixel4() checks for even
or odd pixels, but if you're working on a stretch of pixels that's
completely unnecessary because both will be painted on.
5.8 Mode5Pixels (pp 172)
Inconsistent/unsafe code: RGB() macro
And now the unsafe version is back! O_o.
5.9 Text (pp 173-)
Misc dumbness: font system
“Now, I realize that many programmers use a bitmapped font, and that's not a bad idea at all, because the font characters can be treated as sprites. However, I prefer a low-memory footprint and more control over the font display mechanism.”
At which time the book proceeds to create a bitmapped font with a high memory footprint, with few control features. Nice.
Inefficiency: the font
The font used consists of a partial font: ASCII 32 to 96 (some punctuation, numbers and uppercase letters). This is represented by an array of 64×8×8 halfwords. Each halfword can be 0 or 1. This not only leaves out lowercaps and some of the punctuation symbols, but also wastes 15/16 of the space by using only one bit of each halfword. It should have at least been bytes, and at best a 1bpp bitpacked font (tonc:text). As a bonus, the whole 8kiB font (which generally is constant data) is placed in IWRAM!! Losing a quarter of very precious memory for something that's 16× bigger than it's supposed to be it just plain bad.
Inefficiency: DrawChar
Aside from the slowness of DrawPixel3() (which has been
covered already) the extra math in assigning draw will probably take
some cycles as well. Pointer arithmetic would be of great assistance
here both in performance and clarity. Yes, clarity too: pointers have
a way of reducing the amount of written code because all the constant
or nearly constants offsets are integrated into the pointer, rather
than added to the code for every single step.
void DrawChar(int left, int top, char letter, unsigned short color) { int x, y; int draw; for(y = 0; y < 8; y++) for (x = 0; x < 8; x++) { // grab a pixel from the font char draw = font[(letter-32) * 64 + y * 8 + x]; // if pixel = 1, then draw it if (draw) DrawPixel3(left + x, top + y, color); } }
My own mode 3 text writer uses about 1.35k cycles/char. This is a
lot, but it's manageable. This DrawChar() does 5k cycles/char.
But wait! That's with the font in fast RAM. If you put it in ROM
where it's supposed to be, the time jumps to 7k/char! That means that
you're out of the VBlank in about 11 characters so you could just manage
to print, say, lives left and the score.
6 Chapter 6: Tile-based Video Modes
Incorrect info: parallax scrolling (pp 203).
“Mode 0 is great for this because all four backgrounds are hardware rendered. You do not need to write your own parallax scrolling routine.”
Yes and no. Yes, mode 0 is great, but you still need to write your own parallax scrolling routine because the scrolling registers only do scrolling, not parallax. Aside from that, for big maps you'd still need to write your own scrolling engine anyway, but it will be easier and faster than with bitmap modes, because you'd only have to update one row or column of the screenblocks, rather than the whole of VRAM.
Incorrect info: Tile data and Tile map (pp 204)
‘The tile data itself can be stored anywhere in VRAM, as long as it's on a 16 Kb boundary …”
It doesn't need to be aligned to the charblock boundary; it just makes it easier to use, that's all.
“… video memory is divided into 4 logical Char Base Blocks, which are made up of 32 smaller Screen Base Blocks …”
(pp 204-205).
No. Well, yes, but a misleading yes. Screenblocks have nothing to do with charblocks; the two are completely separate entities. That they happen to use the same memory addresses is an inconvenience to watch out for, not something hierarchical.
“The tile map (which defines where the tiles are positioned) must begin at screen base 31 at the very end of video memory.”
This is nonsense. Probably due to sentence fragmentation (I know how easily those can happen), but it's still nonsense. The tile map doesn't define where the tiles are positioned – it defined which tiles are positioned where. There is a subtle difference in these phrases. And the tile maps can be at any screen block, not just at 31.
Incorrect info: DMA Blitting (pp 209)
“Let's not forget that DMA is a hardware process, where a software blitter is compiled and run by the CPU as machine instructions. You can't begin to compare a hardware process with a software process, because anything that is hard-coded into silicon will blow away a series of machine instructions. … you could write a fast memory copy routine in assembler and it would be much faster than a C routine. However, DMA will blow them both away …”
No, we can compare them; yes, we can write our own asm copier; and, no, DMA will not blow them both away. The reason that DMA is fast is because there is no loop overhead. However, with ldmia/stmia instructions, the loop overhead per byte is significantly reduced anyway, especially for sections with high waitstates. In most cases, DMA will only be 10-20% faster in copying. It will actually be slower when it comes to fills, because it will re-read the data to fill with every time, which a software version won't have to.
Bad coding: DMAFastCopy
The checks for DMA16_NOW and DMA32_NOW
needlessly restrict the function in its use. It would be better to
make it into a general DMA3 copier and leave them out (and change the
name to DMA3Copy() too, of course). Additionally, it'd be
better to make the type of DMA source and destination registers into
(const) void-pointers, so that you don't have to cast them all the
time. Even better, cast each channel into
DMA_RECs.
“… I'm simply adding in this small level of error checking to keep the function from overwriting memory somewhere if an invalid option is passed into it.”
Except that it doesn't, because checks for overwrites aren't really covered by the control options but by the destination address and size.
6.1 TileMode0 (pp 211-215)
void DMAFastCopy(void*, void*, unsigned int, unsigned int); //copy the palette into the background palette memory DMAFastCopy((void*)test_Palette, (void*)BGPaletteMem, 256, DMA_16NOW); ... #define BG_COLOR256 0x80 #define CHAR_SHIFT 2 #define SCREEN_SHIFT 8 #define WRAPAROUND 0x1 //background mode identifiers #define BG0_ENABLE 0x100 #define BG1_ENABLE 0x200 #define BG2_ENABLE 0x400 #define BG3_ENABLE 0x800 ... //background memory offset macros #define CharBaseBlock(n) (((n)*0x4000)+0x6000000) #define ScreenBaseBlock(n) (((n)*0x800)+0x6000000) //create a pointer to background 0 tilemap buffer unsigned short* bg0map =(unsigned short*)ScreenBaseBlock(31); ...
#define BGPaletteMem ((unsigned short*)0x5000000) ... //vertical refresh register #define REG_DISPSTAT *(volatile unsigned short*)0x4000004 //wait for vertical refresh void WaitVBlank(void) { while((REG_DISPSTAT & 1)); } ... #define BUTTONS (*(volatile unsigned int*)0x04000130) if(!(BUTTONS & BUTTON_LEFT)) x--;
Partially improper types: DMAFastCopy()
Part of choosing the type of a symbol is making it easy to use.
For general memory, the most useful types are `void*' or
`const void*', because then you won't have to force the
user to cast everything and making the code less readable in the
process. In the case of DMAFastCopy(), the first argument
should have been `const void *', because source data
shouldn't be modified. The destination should be `void*'.
Using these types means that the user won't have to cast explicitly. An
other problem with having identical datatypes here (combined with
not having parameter names) is that you could make the mistake of
switching the arguments and so try to copy from destination to source,
which is generally a bad idea. This is more likely than you might
think, since in the standard method of copying,
memcpy(), the destination address goes first.
Note that the destination is actually already the correct type, but the author casts anyway, needlessly, because he tends to copy/paste from other sources. This causes clutter in the code, making things more difficult to read.
Naming inconsistency: REG_BGxCNT defines
BG_COLOR_256, CHAR_SHIFT,
SCREEN_SHIFT and WRAPAROUND are all part of
REG_BGxCNT. However, apart from the first name, there is
nothing that indicates this. The reason this is bad is that it can
cause conflicts and/or confusion with other register bits. Bits like
BGx_ENABLE, which aren't part of the background control
at all. This is aggravated by the comment, marking the latter four as
background mode bits.
Incorrect #define: WRAPAROUND
WRAPAROUND is incorrect: it should be 0x2000.
0x1 is a priority setting
(gbatek:bg control). And like many other register #defines, it's impossible to tell what WRAPAROUND really refers to. Mapping systems
and text writers can have wrap-around as well, for example.
Improper types: Char/ScreenBaseBlock
In nearly – and perhaps all – cases, char and screen block addresses are used as pointers, not as raw addresses. It'd make sense to put types inside the macros, rather than forcing the user to add them explicitly.
Consistency: BGPaletteMem
The earlier PaletteMem was a variable, now it's a #define.
It should be a #define, but it'd be nice if there was some
consistency in these things throughout the book.
Incorrect #define/info: REG_DISPSTAT
Well, the definition is correct, but not the comment above it.
0400:0004 is the display status, there is no such thing
as a vertical refresh register. Although there is a bit inside
REG_DISPSTAT that checks the VBlank/Draw status, the
register has other functions as well.
Inconsistency/Incorrect function: WaitVBlank
This is the second version of WaitVBlank(), and again it
does not do what it should. Nor what it says. bit 0 of
REG_DISPSTAT checks for VBlank status: if 1, then we're
inside the VBlank
(gbatek:dispstat).
So unlike what the function name indicates, it actually waits till the
VBlank is over, not until it starts. And as a timing mechanism,
it's still useless for reasons described earlier.
Inconsistency/Incorrect #define: BUTTONS
The proper name is REG_KEYINPUT or REG_KEYS
or something else that starts with REG_. Additionally,
REG_KEYINPUT is a vu16, not vu32.
Improper/inefficient code: !(BUTTONS & foo)
REG_KEYINPUT uses active-low settings (which should have
been mentioned here as well somewhere, as it's not exactly obvious).
In other words, the bit is clear when a button is down. To go to an
active-high setting, one should invert the bits and then mask:
`~BUTTONS & foo'. That says “check if button
foo is down”. The code, `!(BUTTONS & foo)',
checks whether foo is not not down. Technically it's the
same, but the formulation is awkward. Aside from that, a logical NOT
and a bitwise NOT are not the same. Also, in most cases the bitwise
version will be quicker, especially on modern compilers.
6.2 RotMode2 (pp 218 - 224)
Improper types: REG_BG2foo
The affine registers are signed, not unsigned. Technically, it
doesn't really matter here because these things are write-only, but
people will base their code on this, use the unsigned types, and then
see the screen go wonky because a signed 0xFFFF is not an
unsigned 0xFFFF. And yes, the proper term is
‘affine’, not Rotation or Rot/Scale. The registers form
a general 2×2; affine transformation and while rotations are
affine transformation, not every affine transformation is a rotation.
The term is incorrect and misleading. It'd be like calling Americans
“Mexicans”. Yes, Mexicans are Americans, but not all
Americans are Mexicans. (For those wo disagree with the sentence
“Mexicans are Americans”, remember that America is a
continent; the USA is a country. That the inhabitants of the
US are referred to as “Americans” is just another example
of the problem I'm addressing here.
Unexplained terms: RotateBackground
Yes, the function has something to do with rotation and scaling,
but nowhere is it actually explained what the terms are and what
it actually does. The function basically implements
tonc:eq 12.4,
with
α = −ang,
map anchor p0 = (scroll_x,
scroll_y) and
screen anchor q0 = (cx,
cy). Look there
for a more efficient version of this function as well.
The formulation and order of the terms in the equations is …
confused, especially for those who actually know their math. When
it comes to writing down matrix equations explicitly, it is better
to have the first column (the terms multiplied by
x_center) before the second column (the
y_center terms). Aside from that, you do not
use all-caps for variables. You just don't. Ever.
void RotateBackground(int ang, int cx, int cy, int zoom) { center_y = (cy * zoom) >> 8; center_x = (cx * zoom) >> 8; DX = (x_scroll - center_y * SIN[ang] - center_x * COS[ang]); DY = (y_scroll - center_y * COS[ang] + center_x * SIN[ang]); PA = (COS[ang] * zoom) >> 8; PB = (SIN[ang] * zoom) >> 8; PC = (-SIN[ang] * zoom) >> 8; PD = (COS[ang] * zoom) >> 8; }
Inefficiency: sin/cos tables should be power-of-two
Power-of-Two-sized (PoT) arrays for periodic functions make things a lot easier, because you can then simply mask the index to keep from going out of bounds.
WaitVBlank() version 3
And this one could actually be considered correct. Too bad it isn't a
function, but loose parts inside the main loop. The loop start with
a wait-for-vblank, and ends with a wait-till-vdraw, which if
swapped gives the correct implementation of WaitVBlank().
Only in this case, the update code is actually done inside the VDraw
period, thanks to the first `while(REG_DISPSTAT & 1);'.
Oh well.
7 Chapter 7: Rounding up Sprites
Missing info: Converting tiles (pp 231)
gfx2gba can be used for object tiles as well. In fact, it's notably better than pcx2sprite. (Though grit is better then either here, of course :))
7.1 SimpleSprite (pp 236 - 241)
Terminology: SpriteMem, SpriteData and SpritePal
While it's good that they're clearly indicated as having something
to do with sprites, There is the potential for confusion between
SpriteMem (i.e, OAM) and SpriteData
(Object VRAM), because Mem and Data are
pretty much synonyms. But at least Object VRAM isn't called
OAMData, which would be even worse. Also, we have now
three different nomenclatures for palettes: paletteMem,
BGPaletteMem and SpritePal. Nice.
Inconsistent/confused terminology: Obj attribute bits
Once again, it is important to indicate where bit defines belong,
which isn't done here. For example, COLOR_256 sets
the color-mode of an object to 256-color, but backgrounds have
a similar option, but at a different bit. Yes, it's called
BG_COLOR256 there, to avoid the naming collision,
but unless you already knew that, you might be tempted
to use COLOR_256 in both cases or that there was
a OBJ_COLOR256. The same goes for the
MODE_x names, which might lead one to think that
they're similar to video mode bits and hence belong to
REG_DISPCNT. Even better, the once that do have an
OBJ prefix actually belong to REG_DISPCNT,
not object attributes.
Improper types: x, y
Local variables should be ints, unless you have a
really good reason to use something else. ARM cores are 32bit,
so use 32bit where you can. For a little more on this, see
tonc:
good/bad practices.
Inefficiency: UpdateSpriteMemory
void UpdateSpriteMemory(void) { int n; unsigned short* temp; temp = (unsigned short*)sprites; for(n = 0; n < 128*4; n++) SpriteMem[n] = temp[n]; }
This is a very bad implementation of an OAM updater. Using
u32 pointers instead of u16* would double
the speed. Using dedicated copiers (like DMA) would increase the
speed by a factor of 5 to 10. Do not use manual copies unless you're
working on small ranges; use memcpy(), DMA or
CpuFastSet()
7.2 BounceSprite header (pp 243 - 245)
Magic numbers: SpriteData3
//video modes 3-5, OAMData starts at 0x6010000 + 8192 unsigned short* SpriteData3 = SpriteData + 8192;
Magic numbers in code are generally a bad idea. When it comes to
tile addresses, magic numbers are very common for some reason. At the
very least, one should make a macro that takes care of the raw math,
so that you just have to enter the charblock and tile index to get the
right address. Even better would be to define TILE and
CHARBLOCK types so that you can map VRAM into tiles.
By doing so, they can be used not only for addressing but also for
copies (tonc:tileblocks).
But even if you do use magic numbers, it's better to use hex: 8192 doesn't mean that much. And you have to be very careful with datatypes because pointer addition works by type, not by byte. In particular, SpriteData+0x2000 = 0x06014000, not 0x06012000, because the datatype of SpriteData is u16. The code is correct here, but the comment is not.
7.3 BounceSprite main.c (pp 246 - 251)
void MoveSprite(int num) { //clear the old x value sprites[num].attribute1 = sprites[num].attribute1 & 0xFE00; sprites[num].attribute1 = sprites[num].attribute1 | mysprites[num].x; //clear the old y value sprites[num].attribute0 = sprites[num].attribute0 & 0xFF00; sprites[num].attribute0 = sprites[num].attribute0 | mysprites[num].y; } ... //draw the background for(n=0; n < 38400; n++) videoBuffer[n] = bg_Bitmap[n]; //set the sprite palette for(n = 0; n < 256; n++) SpritePal[n] = ballPalette[n]; //load ball sprite for(n = 0; n < 512; n++) SpriteData3[n] = ballData[n];
Incorrect/bad code: MoveSprite
Real C programmers know how to use compound assignment operators like &= and |=. Not using them makes the code longer, which is generally more difficult to parse, which makes it easier to make errors. Additionally, the new sprite coordinates should be masked; otherwise they can overwrite the rest of the bits (tonc:obj-position).
7.4 TransSprite (pp 258 - )
Bad design methodology: InitSprites()
If you want to make general functions, make sure that what they do
makes sense and don't set bits that have no place in them. For example,
InitSprite() forces objects to be transparent. This
should be dealt with via a parameter, not forced. It's probably better
for everyone to just have a number of parameters that mimic the
attribute settings. This limits the amount of function arguments and
shortens the code significantly.
Incorrect functionality: SetTrans() and SetColorMode()
Only from the code could you tell these have something to do with objects. Aside from that, what they do is not general enough. What should be done here is mask out the relevant bits and then mask the new bits in. As it is now, they will only update attr0 with position, transparency and color-mode settings, and erasing the other bits. Bits like the shape information. In short, the functions are badly named, slow and incorrect.
Magic numbers: REG_BLDMOD and REG_COLEV
It'd be nice to make some register #defines for these things. Not
everyone will know that (1<<4) will mean object
top-layer transparency.
Bad struct design: SpriteHandler
While using ints is usually good, one place where the
case isn't always so clear-cut is in structs. The thing about structs
is that the cost memory space, and IWRAM space at that. The
SpriteHandler struct takes up 8 words, and the array
128*8*4 = 4 kiB. That's already 1/8th of IWRAM. Now, sometimes this
is okay because, well, sometimes you just need large amounts of data.
But the members alive, colormode and
trans are all single bits, and size only requires 2 bits.
So about half of the struct is always empty. The solution is
bit-packing, which is how most of GBA memory works anyway. It should
be done here as well.
Bad names: dirx/diry
These aren't directions; they're velocities. Directions don't have magnitudes; vectors like velocities do. Additionally, directions don't necessarily imply movement: positions have a direction too, as does looking.
7.5 Rotation and Scaling (pp 265 - 266)
Incorrect explanations
“Is there really any need to draw pre-rotated sprites anymore when support for rotating a sprite is built in to the GBA hardware?”
Yes, it is. There are only 32 affine matrices for objects, which may not be enough of all your sprites. Also, the affine transformations don't give very smooth results and have a number of artefacts that can look very ugly indeed (tonc:affine objects).
“The process isn't perfect because the GBA doesn't have a floating point processor …”
That's not the reason it's not perfect. The process is one of sampling; sampling can cause artefacts because that's just how sampling works, unless you do some funky anti-aliasing.
“… so all the rotation must be done in fixed-point math.”
Which is a non-obvious subject for most nowadays, and should be explained a little. Preferably when it's first used, which was in chapter 6.
“But that can be solved easily enough with a precalculated array of sine and cosine values.”
Yes, and this is exactly what was done in the previous chapter. This discussion of how GBA rotation works should have gone there.
“This is a refinement over the SIN and COS arrays you saw in the previous chapter, as there is no longer any need for a source file containing these radian values since they're just computed at the start of the program. This does cause a slight delay ”
Apart from the delay, everything in this sentence is wrong. What's described here is to precalculate the sin/cos arrays in-game, rather than precalculate them on the PC. This does cause a slight delay because the GBA is very bad at floating point math, which is necessary to build the arrays. But that's the least of your trouble. Because they're build in-game, the arrays will go into IWRAM as well instead of ROM. In this case, That's 2*360*4 = 2880 bytes of IWRAM wasted on what's essentially constant data. Not only is this a waste, it's a waste of waste because
- You only need either a sine or cosine table, as they're basically the same.
- At .8 fixed-point, you'll only have 9 significant bits. The other 23 are essentially wasted.
To build them outside the game and link them in is the right thing to do. To build them in-game is not a refinement, it's a detriment.
Improper typing/naming: RotData
The types here should be signed (tonc:affine-types)! Additionally, it isn't rotation data: it's affine data, as discussed earlier (tonc:affine).
7.6 RotateSprite
Wasted memory / missing member: SpriteHandler (pp 270)
There are two new members here: rotate and
scale. Contrary to what you might think, rotate
isn't the angle; it's the flag marking the sprite as an affine sprite,
which is completely unnecessary because the sprite functions in this
project only work with affine sprites and are generally incompatible
with the functions of the other sprite projects. The real angle is
kept in a member called angle, which is, in fact, missing
from the SpriteHandler definition, so the code wouldn't
compile.
Button #defines (pp 271)
It would be much easier if these were in hex.
InitSprites() and ROTDATA() (pp 274)
The function uses ROTDATA(tileIndex) to indicate the
affine matrix number. Affine indices max-out at 32; tile indices can
be up to 1024 and will be greater than 32 very often, making it a
bad idea to use tileIndex like this. Bonus points
for ROTDATA() not masking its input, so that the size-bits
would be overwritten as well.
RotateSprite (pp 275)
Amazingly enough, this one actually does what it's supposed to.
Inefficient/bad/incorrect code: CheckButtons()/ Pressed() (pp 276)
It's a good practice to read REG_KEYINPUT into a variable
and use that for interaction. This way, you can go to active-high
status, and do more complicated things like test for key-helds and
releases and the like
(tonc:keys-advanced).
However, the way it's done here makes little sense and only serves
to complicate matters. The BUTTON_foo #defines
are bit-#defines. So to check for key-presses all you have to do is
mask out that bit. This is how it was done in previous projects.
This time round, however, each button is read out into an element of
an array in CheckButtons(), and then check for
down-status in Pressed() using a switch-block. Let
me repeat: instead of a simple AND for the button we want, we now
have 10 ANDs, and a 10-fold switch-block that checks an array element,
using the bit-defines as a selection criterion. Not only is this
much more complicated and slow, it blocks checks for multiple
buttons, could be better implemented using button-enums instead of
the button #defines so that you could use a simple look-up instead
of a switchblock, and it again wastes memory because the array
uses ints where only a single bit is ever used.
Bonus points for missing a few array-deferences ([1] - [4]), making this code not even compile.
Incorrect comments: SetMode() (pp 277)
//set the video mode--mode 3, bg 2, with sprite support SetMode(2 | OBJ_ENABLE | OBJ_MAP_1D);
This actually sets the video-mode to 2 (which is a good thing because
the tile indices would be 512 or higher, making InitSprites()
fail) and no background.
7.7 Animated Sprites (pp 279 - 280)
Incorrect information frames and OAM
“The easiest way to animate sprites is to copy a particular frame on an animation sequence into OAM so that it is rendered during the next screen refresh.”
Not quite. The easiest and quickest way would be to have all the
frames in VRAM already, and update the tile index. This won't always
work because there may be too many frames, in which case you'd have
copy new frames into VRAM. Note, copy into VRAM! OAM stands for
Object Attribute Memory, which is at 0700:0000. The animation
frames go into 0601:0000, which is part of VRAM.
7.8 AnimSprite main.c (pp 283 - 289)
Incorrect code: WaitVBlank() (pp 286)
This is the fourth version of WaitVBlank(), and it's
still incorrect. However, you can see that it's progressed
somewhat: it waits for a particular scanline and then waits until it's
over, so that it can be used for timing. The only problem is that the
scanline it checks is line 0, not line 160, so it waits for the VDraw,
not the VBlank.
Improper types/magic numbers: UpdateBall()
You should never, ever use non-ints for loop variables. In this case, it slows down an already slow loop by another 10-20%. And then there's the matter of the magic 512 … what is it? Also, this only updates the first few tiles of Object VRAM, so that it's not generally usable.
8 Chapter 8: Using Interrupts
Incorrect info: timer importance
Then I talk about the all-important subject of timers, how to slow down your program to a consistent framerate. This has been something of a problem in the prior chapters (aside from using the VBlank), but now you will have the means to correct it.”
The GBA timers have nothing to do with keeping a consistent frame rate. Synchronizing should use the VBlank, the problem in the prior chapters was that not one of the projects did this correctly.
8.1 Using Interrupts (pp 296)
Incorrect info: software interrupts (pp 296)
“A software interrupt is common in a multitasking operating system like Windows 2000 or XP. Since the GBA is a video game machine, as you may have expected, all interrupts occur on the hardware side.”
This is not true. There are indeed software interrupts, also known as BIOS calls (gbatek:bios-functions).
Incorrect info: REG_DISPSTAT{VCOUNT} (pp 297)
The VCount trigger in REG_DISPSTAT used bits 8-15, not
6-15.
Incorrect info: bit-ops (pp 298)
“The hexadecimal values in this list of definitions allow you to perform a bitwise AND with the REG_IE register in order to set the specific bit …”
Bitwise AND clears other bits; to set, use bitwise OR.
Not quite correct info: multiple interrupt timings (pp 298)
One and only one interrupt will occur at a time! So the REG_IF register will only have one bit set, not several. …”
This isn't completely true. Yes, only one interrupt will fire at a time (unless you get really unlucky), but multiple bits in REG_IF may be set. This can occur when dealing with nested interrupt routines, or if you get an interrupt but don't acknowledge it.
Incorrect info: acknowledging irqs (pp 299)
REG_IF |= INT_TIMER0;
To acknowledge an interrupt, you need to write that bit to REG_IF,
not use an OR-EQ. So that's `=', not `|='.
Yes, this is a little odd, but that's how it really works. Using
`|=' would in fact clear all the interrupts.
Code silliness (pp 300)
if((REG_IF & INT_HBLANK) == INT_HBLANK)
When using a single-bit mask, you don't have to use the ==
and such. After all, it can be nothing else but that or 0, and these
will be enough.
8.2 InterruptTest (pp 301 - 305)
Proper typedefs
The standard typedefs for (un)signed char/short/int are finally here, yay!
Improper typing of registers
IO registers should always be volatile. Especially interrupt registers. That means you, REG_IE, REG_IF and REG_IME.
Bad, bad code for interrupts: MyHandler plotter (pp 305)
The object here was to draw a random pixel at every HBlank. As the
author indicated, “… hblank code must be fast!”. So
the last thing you should be doing is call rand()
3 times, call modulo 3 times, use a slow function, and do all these
things in ARM code from ROM. The whole routine will probably take
about 4-8 scanlines to execute.
8.3 Using Timers (pp 306 - 307)
Incorrect code: timer check (pp 306)
timer = REG_TM0D; if (timer % 65536) { // overflow--time to deal with it }
Timers run continuously so they will be a multiple of 65536 exactly is next to zero (well, 1/65536, really). Meaning that this code will execute pretty much all the time.
8.4 TimerTest (pp 307 - 314)
int timers; timers[0] = REG_TM0D / (65536 / 1000); timers = REG_TM1D; timers = REG_TM2D / (65536 / 1000); timers = REG_TM3D;
Improper types: timers
Again, array operators have gone somehow.
Problematic code: integer division
Because the 65536/1000 division is done first, you get a 1% inaccuracy
since this rounds down to 65. It'd have been better to use
`REG_TM0D*1000/65536'.
8.5 The Framerate program
Incorrect info: VBA accuracy (pp 315)
“I think this program demonstrates that the VisualBoyAdvance emulator is working perfectly, because a consistent framerate of 60 FPS comes through when the VBlank is used.”
The problem is that it's well known that VBA is inaccurate in its timings. Keeping a consistent frame rate is the easy part for an emulator, because you just update REG_VCOUNT and those kinds of timers with the PC clock.
Incorrect code: framerate counter (pp 327)
timer = REG_TM3D / (65536 / 1000); frames++; if (timer > 999) { ... //display frame rate sprintf(str, "FPS %i", frames); Print(1, 1, str, 0xFFFF); frames = 0; }
This code doesn't work as a framerate counter. The timer is set to 256 clocks, in other words about 1098 timer increments per retrace (308*228*4 / 256). The timer variable here has a maximum of 65536/65 = 1008. And on average will increment by 1098/65 ˜ 18. Because the check is done at 999, it can miss the overflow, using more frames in a 'second'. If you want a more accurate second timer, use cascaders, like here: (tonc:timer-demo).
9 Chapter 9: The Sound System
Don't know too much about sound programming, so I'll leave this chapter for later.
10 Chapter 10: Interfacing with the Buttons
10.1 Detecting Button Input
Incorrect info: REG_KEYINPUT size (pp 361)
“The button status value at 0x04000130 is a 32-bit number.”
No, it's not. REG_KEYINPUT is 16-bit. The bits at
0400:0132 belong to REG_KEYCNT.
“By the way, almost all locations are 32-bit because the GBA is a 32-bit machine.”
Yes, the GBA is a 32-bit machine, but that has little bearing on how memory is being used or the bus sizes. It's true that the addresses themselves are 32-bit numbers, but most of the time they're accessed as halfwords. With few exceptions, the book used halfword pointers for everything.
Improper variable/inconsistency: BUTTONS (pp 362)
Again, a variable where a direct #define would have been better. In earlier projects, it was indeed a #define. On the plus side, the book finally mentioned the importance of the volatile keyword.
Misc dumbness: use of REG_KEYINPUT (pp 362-368)
While I understand the value of rhetorical style elements such as
presenting a problem and then explaining it, it's just dumb to present
something as a problem when the only issue is that you didn't even
look up what you were dealing with in the first place. In this case,
there has never really been any doubt about how the
REG_KEYINPUT operates: it encodes the status of the 10
keys in of its 10 bits. This is clearly explained in
gbatek:keyinput,
which should always be the first place to look up something new (at
least, for tutorial writers; tutorial readers should be able to use
the info that the tutorial writers have taken from gbatek).
The ‘problem’ in this case is what `*BUTTONS'
does exactly, and while the demo in pages 363-367 does hint at that,
the resulting explanation is a little silly.
“What this means is that the bits are set to 1 by default, and are individually set to 0 when a button is pressed. The problem is that the GBA sets status bits based on position, rather than just storing an arbitrary value! That's right. So while you might expect to see buttons Start=1, Select=2, and perhaps A=3, B=4 and so on, that's not really what's going on.”
In other words, the GBA makes use of a sensible solution to the problem at hand, rather than trying to do it in a complicated and ultimately problematic way. Using bitmasks is what should have been expected. The alternative (indices) may be used in a PC environment, but that's only because the bit-mask solution doesn't really work well when there are so many keys.
The only surprising thing here is the active-low setting (0 for a press instead of 1), but that point is overlooked completely.
10.2 Correctly identifying buttons (pp 369)
Instead of looking up the register in the appropriate reference document, the book describes a way to find it out yourself, using this code:
while(1) { //check for button presses for (n=1; n < 1000; n++) { if (!((*BUTTONS) & n)) { sprintf(str, "BUTTON CODE = %i ", n); Print(10, 40, str, BLUE); break; } } }
“The first thing you will have noticed is that the loop goes from 1 to 1000. That is just an arbitrary number that will accommodate all 10 buttons, because I don't know which bits the GBA uses for each button.”
The code serves to find out which bit belongs to which button. This is a stupendously poor way of finding that out. All you have to do is print out the contents of *BUTTON (or even better, invert the number before printing) in hex and you're done. Really, that's it. No loops; just print it in hex and you can read the bits right off that. Alternatively, if your hex is a little rusty, you could print out the status of the 10 bits as a bit string.
“This is a really good lesson in dealing with low-level hardware interfaces.”
No, it wasn't. When dealing with low-level hardware you're working in bits, so you use hex. Always. It does, however, explain why the table for button bits is given in decimal, rather than hex as it should have been.
10.3 Creating a button handler
Separating buttons into an array (pp 374)
Since button input is a very low-level aspect of programming the GBA, it's helpful to move the actual memory reading code into a function and store the results of the buttons in an array. … The main benefit of polling all the buttons at the same time is that you are more likely to lose a button if there is a lot of code between each poll.”
The motivation is good, the implementation isn't. It's already covered in an earlier chapter, but if you're only interested in the status (that is, a one-bit value), it makes sense to keep it in bits and pack it into one variable. This makes subsequent use so much easier than if they're separated into different variables, not to mention faster. As far as I can see, the only time a button-array is useful is if you need to know how long they're pressed or something like that.
10.4 The ButtonHandler Program
Bad info: key releases
“As you have seen, the button handler need not be complicated …”
No, it needn't, but the version in the book is much more complicated than necessary. For easy implementations, see tonc:keys.
“… unless you want to detect button releases separately from presses.”
Nope, it's quite easy as well … if your handler is simple. Two variables and two ANDs is all it takes. Additionally, note that ‘release’ is used opposite of ‘press’ here. But the book's handler doesn't check presses (that is, signal when the button if going down), just the key status itself; whether it's up or down. Again, see tonc:keys for how it should go.
“… because when it comes to GBA coding, all that matters is responding to a button press.”
No. Key releases are very important for things like charging and combos.
The rest of the chapter concerns a combo detector, which is interesting, but doesn't suffer from anything that hasn't been covered before. All in all, this chapter has been 30 pages of stuff that could have easily been covered in 10, and covered better and in more detail as well.
11 Chapter 11: ARM7 Assembly Language Primer
Incorrect name of chapter
The basic idea behind a language primer should be to show how the language works. This chapter doesn't. There is next to no information about the instruction set, what it can and cannot do, or how to properly integrate it with C. What is does do is (badly) show how you can assemble a file, and show some very slow assembly routines. For a more useful introduction in the nuts and bolts of ARM assembly, see tonc:asm.
Bad practice: batch files (pp 396)
While batch files are indeed easier to work with than makefiles for small programs, for anything beyond the most simplistic stuff, makefiles are far better. Giving them up for batchfiles is a step (possibly even several) in the wrong direction. Using batchfiles like it's done here – as the place to collect the compiler flags – is a particularly bad idea because it means you still have to run several of them to complete a build.
Bad info: assembly difficulty (pp 400)
“Indeed, assembly is difficult to master, because each assembly instruction translates directly to a CPU instruction!”
Actually, ARM assembly is pretty easy. In many ways, easier than C because the format is simpler. There are no chances of operator precedence snafu's or typing problems, because each operator is an instruction and there is only one type: the word.
Bad code: linker batchfile (pp 403)
The linker batch file can only be used on one file, but Real Projects usually have multiple files.
As an example of how things should go, see the devkitPro or third-tier tonc template makefiles. If these are a little too advanced, use the second-tier tonc makefiles, which are still relatively easy to understand for makefile newbies.
11.1 FirstAsm Program (pp 406)
I can't be 100% sure, but I'm quite confident that this code of this
was simply taken directly from the gcc output. The presence of
decimal numbers for addresses is usually a dead giveaway. These
should have been converted to more reader-friendly hex-values.
The value 67108864 means nothing; but its hex
representation, 0x04000000 is, of course, recognizable
as the IO register base. The same is true for 33554432.
According to the comments, it's supposed to be VRAM, but
33554432 is 0x02000000, not
0x06000000. So I guess something went wrong here.
Also note the complete lack of explanation of the code, and why it's
written the way it is. Why, for example, you can't just do
`mov r2, #1027', but have to use a mov and
an add? Well, it so happens that ARM cores can't use
use constant values directly if they span more than a byte. It's
things like these that make assembly tricky to use, and therefore
where explanatory texts should focus on.
11.2 DrawPixel32 (pp 409)
.ARM
.ALIGN
.GLOBAL DrawPixel32
DrawPixel32:
stmfd sp!,{r4-r5} @ Save register r4 and r5 on stack
mov r4,#480 @ r4 = 480
mul r5,r4,r1 @ r5 = r4 * y
add r5,r5,r0,lsl #1 @ r5 = r5 + (x << 1)
add r4,r5,r3 @ r4 = r5 + videobuffer
strh r2,[r4] @ *(unsigned short *)r4 = color
ldmfd sp!,{r4-r5} @ Restore registers r4 and r5
bx lr
Incorrect name
The function has nothing to do with anything 32-anything. Usually, a 32 affix means there are 32-bit access somewhere, but this function doesn't have them. What's probably meant is that this was done with the ARM instruction set (32-bit) instead of Thumb (16-bit instructions), but that's mostly irrelevant from C's perspective.
Improper typing: videobuffer
The videobuffer parameter is an u32. Should have been a pointer. For the asm it doesn't matter, but it makes the use in C easier.
Inefficient code: DrawPixel32()
This is quite possibly the worst implementation for plotting
mode 3 pixels you could find. The code has 8 instructions; only 5 are
necessary. You can do everything with 4 registers so that it's not
necessary to use the stack. The multiplication can also be replaced
with something simpler, because it can be done with shifted
rsb and add's. It should have read:
// declaration void DrawPixel32(u32 x, u32 y, u32 color, u16 *dst); ... add r0, r3, r0, lsl #1 @ r0= dst+x rsb r1, r1, r1, lsl #4 @ r1= y*15 add r0, r0, r1, lsl #4 @ r0= &dst[x + y*15*(32/2)] strh r2, [r0] @ dst[x+y*240]= color bx lr
In fact, that's what I get when I get from devkitArm's compiler now. The whole point of using assembly is to create code that'll run faster than a compiled equivalent. If you're making code that's actually slower, you're doing something very, very wrong.
12 Conclusions
Well, that's about it I guess. The points mentioned here are the ones I could find on a first full read. While I admit that some of these points are more valid than others, I do think that each of them is worth thinking about. In the future I may add one or two points I've missed, or remove ones that are really just nitpicking.
hello...
wonderful...
Thanks for this. We used this book in a course for school, and I've been trying to explain why the book wasn't a great reference.
How Offset printing Works...
The demand for quality print and fast turn around time is always a requirement set by customers. No matter what the cost it may be all they want is to achieve the satisfaction and have the quality they want for their materials....
Jessie...
Thanks for the info. By the way, I am a big fan of your site. Keep up the great work....
Eric...
Thanks for sharing. I agree and would add that th...
covering letters format...
As a result, TrackBack spam filters similar to those implemented against comment spam now exist in many weblog publishing systems....
It's easy to stick a flag on top of a building, making observations about all of the design flaws in the building while taking the elevator ride up to the top, but in the end, all you have done is affixed a flag on top.
First of all, this e-book was professionally edited and formatted, and then given away for free. I asked for a donation for the sources but have given away many copies of the sources for free to anyone who asked.
Secondly, the book was written to be easy to understand, not highly optimized. The material is very challenging, very low level, and maybe didn't succeed at being easy to understand. There are some glaring omissions, like lack of a decent sprite animation example with multiple sprites. After all these years, I've added a few more example games and demos, including a very nice sprite class and sprite handler, as addendum.
That being said... what's the point of this website? Did you pay for the book? Are you helping anyone by pointing out all the flaws in the free book? I disclaimed it in the introduction and first chapter that this book would teach the basics of the GBA. Even though it's not optimized code, I have gotten emails from guys who have used it to get a job working on handheld projects at game studios. I got an email from someone at Firaxis Games about it. There are teachers out there using this book and MY sources to teach courses without even asking, without even crediting MY hard work, let alone sending me a little email like "Hey, using your GBA stuff, it's been helpful, thanks." Nada. Nothing but takers...like YOU.
I have a suggestion: Write your own book. If it's so easy, as you seem to believe with this errata-or-whatever site, I'm sure you would have no problem writing a book yourself--one that will be 100% balanced on both sides of the equation.
You're just angry cos you know he's right.
Hey, pointing out flaws makes things better. Where would Open Source be without it?
Also, if you can't handle (a little) criticism, don't write anything on anything. I could see him ripping much deeper.
(And e-ghasp, he never attacked you personally, just your writing!)
"That being said... what's the point of this website?"
well, for starters... this WEBPAGE (not the site) was made to point out to absolute newbies that your book has some (serious) flaws in it, and that if they DO happen to learn off your book, to be wary of the flaws.
"I have gotten emails from guys who have used it to get a job working on handheld projects at game studios."
I believe that but... did they keep using YOUR coding practices? Doubt it, 'cause, like the page pointed out, the practices are... lame to say the least.
"Nada. Nothing but takers...like YOU."
Umm... you DO realize that's a personal insult yet he never insulted you, right?
"I have a suggestion: Write your own book. If [...]"
He pretty much did. Look at TONC, or are you THAT angry at him for writing up a great GBA tutorial?
Except that I'm not planting any flags; I'm trying to point out the flaws in the building. If in a building the floors are uneven, the support structures are rotten and the elevator cable is showing signs of fatigue, it's worth pointing those out to the tenants. In fact, it could be considered irresponsible not to.
I'm not attacking the book on its formatting, spelling, grammer or layout; I'm talking about its mistakes and bad programming practices. The fact that it's free does not make those things go away.
The book is easy to understand; that's not the problem. The problem is that much of the information is incorrect or leads to poor code. And I'm not asking for ‘highly’ optimized. I understand that optimization often comes at the cost of readability, but when it doesn't you might as well give the better version. This ultimately helps the readers, as they won't have to spend time finding a suitable replacement when the original function proves insufficient.
For instance, the simple addition of
inlineto the pixel plotters already speeds things up by roughly a factor of three. This kind of thing comes at no expense to the reader at all. (And yes, I know that this is indeed done in the demos written later. But it's still absent from the book.)Another example is the mode 3 text system. The font that's advertised as having a low memory footprint is actually sixteen times too large and takes up a substantial bit of IWRAM. A 1 bpp bitpacked font would have worked out much better. If bitwork is considered too difficult, the font could have at least been compressed to bytes rather than halfwords. This could have been done by changing the declaration from `
unsigned short font[]' to `const unsigned byte font[]'.Yes, I am.
I care deeply about correctness, efficiency and taking heed of recommended practices. I believe these things to be beneficial to the quality of whatever one is working on. This is especially true for GBA programming, where resources are sparse. It's part of why I like it.
The point is, sloppiness is bad, and it deserves to be pointed out. this way, people with less experience can both avoid the bad practice, and perhaps recognize it when they encounter it somewhere else. I really do not see how this can be considered a bad thing.
When it comes to learning something new, most people will simply copy what their sources tell them. In most cases this works out fine, but if the information has flaws, those flaws will be copied as being factual and possibly spread.
The GBA homebrew scene is a good example of this. In the beginning, there was PERN. This was very easy guide to the basic elements of programming for the GBA, but it did contain some factual errors and quite some bad programming practices and style. Nearly all GBA programming guides that followed inherited some or all of its problems. This includes your book, which fortunately has done away
with some of the problems, but unfortunately has added a few new ones as well.
People have been basing their code on these guides and also adopted their bad practices. Dozens, perhaps hundreds of threads on the gbadev forum are related to these issues. Many people, me included, have been offering solutions to these problems; problems that would have never come up if people (especially the writers) had simply followed standard programming practices in the first place.
Part of the reason I wrote Tonc was to warn people about the problems found in other tutorials, how to do things properly and why. This page is simply an extension of that. I could have written one for some of the other guides as well, but since the others don't have the presence that your book has (with good reason: it is superior in layout, presentation and coverage), I find it the only source that's worth doing it for.
I know I sometimes sound … irritated in the text. This is because some of the mistakes simply confound me. For example, there are several different methods of vertical synching in the book and they're all wrong! That a well-willing amateur makes such a mistake is bad enough, but to see a professional writer and computer science teacher do this just boggles my mind.
The only reason I can think of for this problem is that they were simply copied blindly from other tutorials without checking if they worked. You can't expect a student to catch something like this when the teacher has missed it repeatedly. This just illustrates the need for a document like this one.
Yes, I know. That's part of the reason this page exists: to try to prevent erroneous information from being passed on to the students. For the record, just as there are people for whom it's been helpful, there are other who would like it to go away. See here, for example. I am not alone in this.
Please point out anything that I've taken from the book other than for the purpose of this document.
In a manner of speaking, I have.
I know very well how hard it can be, and how it feels when someone speaks badly about it. But just because it is hard work doesn't mean it can't be criticized. In fact, when it comes to a textbook, I think that flaws should be pointed out and if possible fixed. Students should be able to trust a textbook to give them accurate information and correct procedures. Feel free to disagree, but I believe it is the responsibility of an author to check the book's contents and to fix or at least point out any problems that are were missed at its release. If he does not, then somebody else might do it for him.
To Jonathan:
Instead of being so defensive, why don't you just fix the issues in your book? You are a professional programming author, so surely you care about correctness? In this case, the reason the GBA programming community recommends against your book, and highly recommends Cearn's TONC is because his is correct, and yours is not. We all make mistakes, and graciously accepting feedback and correction shows that you care about the quality of your work, and improves both your own knowledge, and allows the entire community to benefit. Cearn has corrected me multiple times, and I have learned and benefited from it.
Fighting back in the manner you have done accomplishes nothing for yourself or the community -- other than turning people away from the rest of your books. Telling Cearn that he is "Nothing but a taker" is but a sophomoric jab, as he has spent a lot of time and energy providing the GBA community with the best tutorial that exists.
So if you'd like us to take your book (or any of your other published books) seriously, put down your ego, learn from others, and everybody wins.
Leave a comment
Powered by WordPress