Some new notes on NDS code size

When I discussed the memory footprints of several C/C++ elements, I apparently missed a very important item: operator new and related functions. I assumed new shouldn't increase the binary that much, but boy was I wrong.

The short story is that officially new should throw an exception when it can't allocate new memory. Exceptions come with about 60 kb worth of baggage. Yes, this is more or less the same stuff that goes into vector and string.
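To make the "new throws" part concrete, here is a minimal desktop sketch (not NDS-specific, and not taken from the project measured here). The helper name `allocation_throws` and the absurd request size are my own illustration; the only point is that a failed plain `operator new` reports failure through a `std::bad_alloc` exception, which is what drags the exception machinery into the link.

```cpp
#include <cstddef>
#include <limits>
#include <new>

// Hypothetical helper: returns true if a failing plain operator new
// reports the failure by throwing std::bad_alloc.
bool allocation_throws()
{
    // Ask for far more memory than any allocator can hand out. The size is
    // read through a volatile so the compiler cannot reject or fold it.
    volatile std::size_t huge = std::numeric_limits<std::size_t>::max() - 4096;
    try {
        void *p = ::operator new(huge);
        ::operator delete(p);   // only reached if the allocation succeeded
        return false;
    } catch (const std::bad_alloc &) {
        return true;            // the standard failure path is an exception
    }
}
```

On a typical desktop toolchain this should return true; the throw hidden inside `operator new` is the part that matters for binary size.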

The long story, including a detailed look at a minimal binary, a binary that uses new and a solution to the exception overhead (in this particular case anyway) can be read below the fold.

 

1 Minimal project

The following is essentially an empty project. It should represent the smallest binary you can get with the current devkitARM (DKA, r26) and libnds (1.3.7). This is the primary reference case.

#include <nds.h>

int main()
{
    while(1) ;
}

This actually already leads to a binary of 53.5 kb. To analyze what goes on in there, we can look at the map file. Not the mapfile generated by the linker, mind you, but the one produced by the arm-eabi-nm tool, whose output is considerably easier to read. To use this tool, add the arm-eabi-nm line to the $(BUILD) rule in the makefile, so that it looks like below. In short, -S prints each symbol's size and -n sorts the symbols by address; for anything else, please RTFM.

$(BUILD):
    @[ -d $@ ] || mkdir -p $@
    @make --no-print-directory -C $(BUILD) -f $(CURDIR)/Makefile
    arm-eabi-nm -Sn $(OUTPUT).elf > $(BUILD)/$(TARGET).map

And this is the resulting mapfile, in full.

         w _Jv_RegisterClasses
         w __deregister_frame_info
         w __register_frame_info
00080000 N _stack
01000000 A __vectors_end
01000000 A __vectors_start
01000100 A __itcm_start
01000100 000000c8 T irqTable
010001c8 T IntrMain
010001fc t findIRQ
01000218 t no_handler
01000228 t jump_intr
0100023c t got_handler
0100025c t IntrRet
01000290 A __itcm_end
02000000 T __text_start
02000000 T _start
02000194 t ILoop
02000198 t checkARGV
020001dc t .copyforward
020001f0 t .copybackward
02000200 t .copydone
02000214 t ClearMem
02000228 t ClrLoop
02000238 t CopyMemCheck
0200023c t CopyMem
0200024c t CIDLoop
02000300 T _init
02000310 t __do_global_dtors_aux
0200033c t frame_dummy
0200037c 00000004 T main
02000380 000000ec T initSystem
0200046c 00000012 T ledBlink
02000480 0000002c T powerOff
020004ac 00000030 T powerOn
020004dc 00000018 T systemSleep
020004f4 00000010 T powerValueHandler
02000504 00000044 T systemMsgHandler
02000548 00000164 t fifoInternalSend
020006ac 00000038 T fifoSendAddress
020006e4 00000048 T fifoSendValue32
0200072c 00000070 T fifoGetAddress
0200079c 00000074 T fifoSetAddressHandler
02000810 00000070 T fifoGetValue32
02000880 00000074 T fifoSetValue32Handler
020008f4 00000024 T fifoCheckAddress
02000918 00000024 T fifoCheckDatamsg
0200093c 00000024 T fifoCheckValue32
02000960 00000094 t fifoInternalSendInterrupt
020009f4 00000010 t __timeoutvbl
02000a04 000001b8 T fifoInit
02000bbc 00000100 T fifoGetDatamsg
02000cbc 0000040c t fifoInternalRecvInterrupt
020010c8 000000a8 T fifoSetDatamsgHandler
02001170 00000070 T fifoSendDatamsg
020011e0 00000002 T irqDummy
020011e4 0000006c T irqSet
02001250 0000004c T irqInit
0200129c 00000030 T irqInitHandler
020012cc 00000060 T irqEnable
0200132c 00000060 T irqDisable
0200138c 0000002c T irqClear
020013c0 T swiSoftReset
020013c4 T swiDelay
020013c8 T swiIntrWait
020013cc T swiWaitForVBlank
020013d0 T swiSleep
020013d4 T swiChangeSoundBias
020013d8 T swiDivide
020013dc T swiRemainder
020013e2 T swiDivMod
020013ee T swiCopy
020013f2 T swiFastCopy
020013f6 T swiSqrt
020013fa T swiCRC16
020013fe T swiIsDebugger
02001402 T swiUnpackBits
02001406 T swiDecompressLZSSWram
0200140a T swiDecompressLZSSVram
0200140e T swiDecompressHuffman
02001412 T swiDecompressRLEWram
02001416 T swiDecompressRLEVram
0200141a T swiWaitForIRQ
0200141e T swiDecodeDelta8
02001422 T swiDecodeDelta16
02001426 T swiSetHaltCR
02001430 00000030 T __libc_fini_array
02001460 00000050 T __libc_init_array
020014b4 00000080 T memcpy
02001534 00000006 T _times_r
0200153c 0000002c T _gettimeofday_r
02001568 00000014 T _times
0200157c 00000052 T build_argv
020015d0 0000000c T __errno
020015dc T _fini
020015e8 A __text_end
020015e8 00000004 R _global_impure_ptr
020015f0 A __exidx_end
020015f0 A __exidx_start
020015f0 t __frame_dummy_init_array_entry
020015f0 A __init_array_start
020015f0 A __preinit_array_end
020015f0 A __preinit_array_start
020015f4 t __do_global_dtors_aux_fini_array_entry
020015f4 A __fini_array_start
020015f4 A __init_array_end
020015f8 r __EH_FRAME_BEGIN__
020015f8 r __FRAME_END__
020015f8 A __fini_array_end
020015fc d __JCR_END__
020015fc d __JCR_LIST__
02001600 A __data_start
02001600 D __dso_handle
02001600 A __ewram_start
02001604 00000004 D fifo_freewords
02001608 00000004 D fifo_send_queue
0200160c 00000004 D fifo_buffer_free
02001610 00000004 D fifo_receive_queue
02001618 00000004 D _impure_ptr
02001620 00000428 d impure_data
02001a48 A __bss_start
02001a48 A __bss_start__
02001a48 A __bss_vma
02001a48 A __data_end
02001a48 A __dtcm_lma
02001a48 A __itcm_lma
02001a48 b completed.2775
02001a4c b object.2787
02001a64 00000004 b __timeout
02001a68 00000004 B processing
02001a6c 00000004 B fake_heap_end
02001a70 00000004 B fake_heap_start
02001a74 00000004 B theTime
02001a78 00000040 B fifo_datamsg_data
02001ab8 00000800 B fifo_buffer
02001bd8 A __vectors_lma
020022b8 00000040 B fifo_value32_func
020022f8 00000040 B fifo_address_func
02002338 00000040 B fifo_value32_data
02002378 00000040 B fifo_value32_queue
020023b8 00000040 B fifo_data_queue
020023f8 00000040 B fifo_address_data
02002438 00000040 B fifo_datamsg_func
02002478 00000040 B fifo_address_queue
020024b8 00000004 B punixTime
020024bc A __bss_end
020024bc A __bss_end__
020024bc A __end__
020024bc A _end
023ff000 A __eheap_end
023ff000 A __ewram_end
027fff70 a _libnds_argv
0b000000 A __dtcm_end
0b000000 A __dtcm_start
0b000000 A __sbss_end
0b000000 A __sbss_start
0b000000 A __sbss_start__
0b003d00 A __sp_usr
0b003e00 A __sp_irq
0b003f00 A __sp_svc
0b003ff8 A __irq_flags
0b003ffc A __irq_vector
0b004000 A __dtcm_top

Now, I expect you can't really tell much from this, so here's a summary.

[map]
begin       end         size      Description
02000000 - 0200033c             : crt0.S (roughly)
0200037c - 02000380     0004    : main.c
02000380 - 02000548     01C8    : libnds system init/handlers
02000548 - 020011e0     0C98    : libnds fifo routines
020011e0 - 020013c0     01E0    : libnds interrupt.c
020013c0 - 02001430     0070    : libnds bios.s
02001430 - 020015e8     01B8    : libc misc
020015e8 A __text_end
020015e8 - 02001600     0018    : C/C++ ctor/dtor overhead, etc?
02001600 - 02001618     0018    : libnds fifo data
02001618 - 02001a48     0430    : impure ?!?
02001a48 - 02001a78     0030    : misc bookkeeping

02001a78 - 020024b8     0A40    : libnds fifo data + pointers
020024bc A _end

000024bc - 0000D630 0000B174    : ???
[/map]

The 0100:xxxx and 0B00:xxxx ranges belong to ITCM and DTCM, so those are irrelevant when looking at main RAM size. The libc, impure and misc bookkeeping sections are related to the C library and C overhead, and account for about 1.5 kb. The boot code, crt0.S, also covers close to 1.0 kb. As expected, the code for main.c, the actual project, is more or less nothing.

The rest, about 7 kb, is libnds. Now, you may say that this is quite a bit of overhead, but it really isn't. Pretty much all of it relates to interrupts and the fifo system, which takes care of ARM7-ARM9 communication. You need to have these parts. Okay, you could try to roll your own to shrink this down to the bare essentials, but in all likelihood that's more trouble than it's worth.
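As a sanity check on those totals, the rows of the summary table can be added up mechanically. This is just a sketch: the `Range` struct and the variable names are mine, the addresses are copied from the table, and which rows count as "libnds" versus "C overhead" follows my reading of the text above.

```cpp
#include <cstdint>

// One row of the summary table: [begin, end) in main RAM.
struct Range { std::uint32_t begin, end; };

constexpr std::uint32_t size_of(Range r) { return r.end - r.begin; }

// libnds rows.
constexpr Range sys_init  {0x02000380, 0x02000548};  // system init/handlers
constexpr Range fifo_code {0x02000548, 0x020011e0};  // fifo routines
constexpr Range irq_code  {0x020011e0, 0x020013c0};  // interrupt.c
constexpr Range bios_code {0x020013c0, 0x02001430};  // bios.s
constexpr Range fifo_data {0x02001600, 0x02001618};  // fifo data
constexpr Range fifo_bss  {0x02001a78, 0x020024b8};  // fifo data + pointers

// C library / C overhead rows.
constexpr Range libc_misc {0x02001430, 0x020015e8};
constexpr Range impure    {0x02001618, 0x02001a48};
constexpr Range misc_bss  {0x02001a48, 0x02001a78};

// 6920 bytes, i.e. a little under 7 kb.
constexpr std::uint32_t libnds_bytes =
    size_of(sys_init) + size_of(fifo_code) + size_of(irq_code) +
    size_of(bios_code) + size_of(fifo_data) + size_of(fifo_bss);

// 1560 bytes, i.e. about 1.5 kb.
constexpr std::uint32_t c_overhead_bytes =
    size_of(libc_misc) + size_of(impure) + size_of(misc_bss);
```

Note that these are address differences, so they include alignment padding between symbols; summing the size column of the nm output would give slightly smaller numbers.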

 

The observant among you should have noticed something: we're only at 9.5 kb, but the file size is 53.5 kb. So what the hell happened to the other 44 kb? Well, I don't know, to be honest. It doesn't appear in MWRAM to be sure. It's probably the stuff ndstool adds. My guess is that that's where the ARM7 binary goes, along with the icon, titles and possibly DLDI interfaces, but I really can't say right now.
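For reference, the gap can be worked out directly from the numbers in the summary table. The constant names below are mine; the values are `_end`, `__text_start` and the end of the "???" row, taking a kb as 1024 bytes.

```cpp
#include <cstdint>

// Total file size, taken from the end of the "???" row in the table.
constexpr std::uint32_t file_bytes = 0x0000D630;

// Everything the nm map places in main RAM: _end - __text_start.
constexpr std::uint32_t mapped_bytes = 0x020024bc - 0x02000000;

// What the map does not account for.
constexpr std::uint32_t unexplained_bytes = file_bytes - mapped_bytes;

static_assert(unexplained_bytes == 0xB174, "matches the size in the ??? row");
```

That is 54832 bytes on disk against 9404 bytes in the map, leaving 45428 bytes (roughly 44 kb) unexplained.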

2 Standard C++ new/delete

And now, let's look at what happens when you invoke new.

#include <nds.h>

// Never called, but as a non-static function it is still linked in.
void test_std_new()
{
    u8 *ptr= new u8[8];
    delete[] ptr;
}

int main()
{
    while(1) ;
}

Just this small thing increases the file size to 117 kb! And remember, that's not merely a doubling of the size, as 44 kb of the binary is not put in memory. The memory load has gone from about 10 kb to over 70 kb. What causes this increase? Well, let's see:

         w _Jv_RegisterClasses
         w __deregister_frame_info
         w __gnu_Unwind_Find_exidx
         w __register_frame_info
00080000 N _stack
01000000 A __vectors_end
01000000 A __vectors_start
01000100 A __itcm_start
01000100 000000c8 T irqTable
010001c8 T IntrMain
010001fc t findIRQ
01000218 t no_handler
01000228 t jump_intr
0100023c t got_handler
0100025c t IntrRet
01000290 A __itcm_end
02000000 T __text_start
02000000 T _start
02000194 t ILoop
02000198 t checkARGV
020001dc t .copyforward
020001f0 t .copybackward
02000200 t .copydone
02000214 t ClearMem
02000228 t ClrLoop
02000238 t CopyMemCheck
0200023c t CopyMem
0200024c t CIDLoop
02000300 T _init
02000310 t __do_global_dtors_aux
0200033c t frame_dummy
0200037c 00000004 T main
02000380 00000012 T _Z12test_std_newv
02000394 000000ec T initSystem
02000480 00000012 T ledBlink
02000494 0000002c T powerOff
020004c0 00000030 T powerOn
020004f0 00000018 T systemSleep
02000508 00000010 T powerValueHandler
02000518 00000044 T systemMsgHandler
0200055c 00000164 t fifoInternalSend
020006c0 00000038 T fifoSendAddress
020006f8 00000048 T fifoSendValue32
02000740 00000070 T fifoGetAddress
020007b0 00000074 T fifoSetAddressHandler
02000824 00000070 T fifoGetValue32
02000894 00000074 T fifoSetValue32Handler
02000908 00000024 T fifoCheckAddress
0200092c 00000024 T fifoCheckDatamsg
02000950 00000024 T fifoCheckValue32
02000974 00000094 t fifoInternalSendInterrupt
02000a08 00000010 t __timeoutvbl
02000a18 000001b8 T fifoInit
02000bd0 00000100 T fifoGetDatamsg
02000cd0 0000040c t fifoInternalRecvInterrupt
020010dc 000000a8 T fifoSetDatamsgHandler
02001184 00000070 T fifoSendDatamsg
020011f4 00000002 T irqDummy
020011f8 0000006c T irqSet
02001264 0000004c T irqInit
020012b0 00000030 T irqInitHandler
020012e0 00000060 T irqEnable
02001340 00000060 T irqDisable
020013a0 0000002c T irqClear
020013d0 T swiSoftReset
020013d4 T swiDelay
020013d8 T swiIntrWait
020013dc T swiWaitForVBlank
020013e0 T swiSleep
020013e4 T swiChangeSoundBias
020013e8 T swiDivide
020013ec T swiRemainder
020013f2 T swiDivMod
020013fe T swiCopy
02001402 T swiFastCopy
02001406 T swiSqrt
0200140a T swiCRC16
0200140e T swiIsDebugger
02001412 T swiUnpackBits
02001416 T swiDecompressLZSSWram
0200141a T swiDecompressLZSSVram
0200141e T swiDecompressHuffman
02001422 T swiDecompressRLEWram
02001426 T swiDecompressRLEVram
0200142a T swiWaitForIRQ
0200142e T swiDecodeDelta8
02001432 T swiDecodeDelta16
02001436 T swiSetHaltCR
02001440 00000054 t d_make_comp
02001494 0000003a t d_make_name
020014d0 00000058 t d_number
02001528 0000004c t d_call_offset
02001574 00000096 t d_cv_qualifiers
0200160c 00000060 t d_template_param
0200166c 00000160 t d_substitution
020017cc 00000050 t d_append_char
0200181c 00000084 t d_find_pack
020018a0 00000090 t d_source_name
02001930 00000240 t d_expression
02001b70 0000056c t d_type
020020dc 0000009a t d_bare_function_type
02002178 000000ec t d_operator_name
02002264 00000136 t d_unqualified_name
0200239c 000000ca t d_expr_primary
02002468 000000aa t d_template_args
02002514 0000022c t d_name
02002740 0000039c t d_encoding
02002adc 00000060 t d_exprlist
02002b3c 0000008a t d_growable_string_callback_adapter
02002bc8 00000098 t d_append_buffer
02002c60 000000a0 t d_append_string
02002d00 000001f8 t d_print_array_type
02002ef8 00000108 t d_print_mod_list
02003000 00000234 t d_print_function_type
02003234 00000ba0 t d_print_comp
02003dd4 000001c0 t d_demangle_callback
02003f94 0000002e T __gcclibcxx_demangle_callback
02003fc4 000000c0 T __cxa_demangle
02004084 000000c8 t d_print_mod
0200414c 00000104 t d_print_cast
02004250 0000009c t d_print_expr_op
020042ec 000000a8 t d_print_subexpr
02004398 T __cxa_end_cleanup
020043a4 T __aeabi_uidiv
020043a4 0000007a T __udivsi3
02004420 0000000e T __aeabi_uidivmod
02004430 00000002 T __aeabi_idiv0
02004430 00000002 T __aeabi_ldiv0
02004430 00000002 T __div0
02004434 00000010 t _Unwind_decode_target2
02004444 0000002a T _Unwind_VRS_Get
02004470 0000001a t _Unwind_GetGR
0200448c 0000002a T _Unwind_VRS_Set
020044b8 0000001c t _Unwind_SetGR
020044d4 00000020 t selfrel_offset31
020044f4 00000074 t search_EIT_table
02004568 00000004 T _Unwind_GetCFA
0200456c 00000002 T _Unwind_Complete
02004570 00000016 T _Unwind_DeleteException
02004588 000002bc t __gnu_unwind_pr_common
02004844 0000000e W __aeabi_unwind_cpp_pr2
02004854 0000000e W __aeabi_unwind_cpp_pr1
02004864 0000000e T __aeabi_unwind_cpp_pr0
02004874 000000d0 t get_eit_entry
02004944 0000005a t restore_non_core_regs
020049a0 00000080 T __gnu_Unwind_Backtrace
02004a20 000000e4 t unwind_phase2_forced
02004b04 00000018 T __gnu_Unwind_ForcedUnwind
02004b1c 00000034 t unwind_phase2
02004b50 00000060 T __gnu_Unwind_RaiseException
02004bb0 0000001e T __gnu_Unwind_Resume_or_Rethrow
02004bd0 00000040 T __gnu_Unwind_Resume
02004c10 00000268 T _Unwind_VRS_Pop
02004e80 0000001c T __restore_core_regs
02004e80 0000001c T restore_core_regs
02004e9c T __gnu_Unwind_Restore_VFP
02004ea4 T __gnu_Unwind_Save_VFP
02004eac T __gnu_Unwind_Restore_VFP_D
02004eb4 T __gnu_Unwind_Save_VFP_D
02004ebc T __gnu_Unwind_Restore_VFP_D_16_to_31
02004ec4 T __gnu_Unwind_Save_VFP_D_16_to_31
02004ecc T __gnu_Unwind_Restore_WMMXD
02004f10 T __gnu_Unwind_Save_WMMXD
02004f54 T __gnu_Unwind_Restore_WMMXC
02004f68 T __gnu_Unwind_Save_WMMXC
02004f7c 0000002a T _Unwind_RaiseException
02004f7c 0000002a T ___Unwind_RaiseException
02004fa8 0000002a T _Unwind_Resume
02004fa8 0000002a T ___Unwind_Resume
02004fd4 0000002a T _Unwind_Resume_or_Rethrow
02004fd4 0000002a T ___Unwind_Resume_or_Rethrow
02005000 0000002a T _Unwind_ForcedUnwind
02005000 0000002a T ___Unwind_ForcedUnwind
0200502c 0000002a T _Unwind_Backtrace
0200502c 0000002a T ___Unwind_Backtrace
02005058 00000036 t next_unwind_byte
02005090 00000006 T _Unwind_GetTextRelBase
02005098 00000006 T _Unwind_GetDataRelBase
020050a0 0000001a t _Unwind_GetGR
020050bc 0000000e t unwind_UCB_from_context
020050cc 00000018 T _Unwind_GetLanguageSpecificData
020050e4 0000000e T _Unwind_GetRegionStart
020050f4 000002e8 T __gnu_unwind_execute
020053dc 0000002a T __gnu_unwind_frame
02005408 0000000e T abort
02005418 0000002c T fputc
02005444 00000026 T _fputc_r
0200546c 0000005c T _fputs_r
020054c8 0000001c T fputs
020054e4 00000324 T __sfvwrite_r
0200580c 0000007c T _fwrite_r
02005888 00000028 T fwrite
020058b0 00000030 T __libc_fini_array
020058e0 00000050 T __libc_init_array
02005934 00000018 T free
0200594c 00000018 T malloc
02005964 00000504 T _malloc_r
02005e68 00000080 T memchr
02005ee8 00000058 T memcmp
02005f40 00000080 T memcpy
02005fc0 000000a0 T memmove
02006060 00000094 T memset
020060f4 00000002 T __malloc_lock
020060f8 00000002 T __malloc_unlock
020060fc 00000064 T putc
02006160 0000005e T _putc_r
020061c0 0000001c T realloc
020061dc 00000360 T _realloc_r
0200653c 0000005c T _raise_r
02006598 00000018 T raise
020065b0 00000036 T _init_signal_r
020065e8 00000014 T _init_signal
020065fc 00000056 T __sigtramp_r
02006654 00000018 T __sigtramp
0200666c 00000040 T _signal_r
020066ac 0000001c T signal
020066cc 00000044 T sprintf
02006710 00000040 T _sprintf_r
02006750 0000005c T strcmp
020067ac 0000004c T strcpy
020067f8 0000006c T strlen
02006864 000000ac T strncmp
02006910 00000134 t __sprint_r
02006a44 000015d6 T _svfprintf_r
02008020 00000020 T write
02008040 000000c4 T __swbuf_r
02008104 0000001c T __swbuf
02008120 00000042 T _wcrtomb_r
02008164 00000020 T wcrtomb
02008184 000000da T _wcsrtombs_r
02008260 00000028 T wcsrtombs
02008288 000002c8 T _wctomb_r
02008550 000000d0 T __swsetup_r
02008620 00000154 t quorem
02008774 00000e9c T _dtoa_r
02009610 00000114 T _fflush_r
02009724 00000030 T fflush
02009758 00000002 T __sfp_lock_acquire
0200975c 00000002 T __sfp_lock_release
02009760 00000002 T __sinit_lock_acquire
02009764 00000002 T __sinit_lock_release
02009768 00000004 t __fp_lock
0200976c 00000004 t __fp_unlock
02009770 0000001c T __fp_unlock_all
0200978c 0000001c T __fp_lock_all
020097a8 00000014 T _cleanup_r
020097bc 00000014 T _cleanup
020097d0 0000004c t std
0200981c 0000005c T __sinit
02009878 00000030 T __sfmoreglue
020098a8 00000090 T __sfp
02009938 000000a4 T _malloc_trim_r
020099dc 000001ac T _free_r
02009b88 00000064 T _fwalk_reent
02009bec 0000005c T _fwalk
02009c4c 0000000c T __locale_charset
02009c58 00000008 T _localeconv_r
02009c60 00000008 T localeconv
02009c68 00000254 T _setlocale_r
02009ebc 0000001c T setlocale
02009ed8 000000e8 T __smakebuf_r
02009fc0 0000065e T _mbtowc_r
0200a620 00000016 T _Bfree
0200a638 00000054 T __hi0bits
0200a68c 00000068 T __lo0bits
0200a6f4 00000042 T __mcmp
0200a738 00000050 T __ulp
0200a788 0000009c T __b2d
0200a824 00000064 T __ratio
0200a888 00000044 T _mprec_log10
0200a8cc 00000048 T __copybits
0200a914 00000054 T __any_on
0200a968 00000052 T _Balloc
0200a9bc 000000e4 T __d2b
0200aaa0 00000120 T __mdiff
0200abc0 000000c4 T __lshift
0200ac84 00000164 T __multiply
0200ade8 00000016 T __i2b
0200ae00 000000a4 T __multadd
0200aea4 000000b8 T __pow5mult
0200af5c 0000009c T __s2b
0200aff8 00000024 T __isinfd
0200b01c 00000020 T __isnand
0200b03c 00000010 T __sclose
0200b04c 00000030 T __sseek
0200b07c 0000003c T __swrite
0200b0b8 0000002c T __sread
0200b0e4 0000005c T _calloc_r
0200b140 000000a2 T _fclose_r
0200b1e4 00000018 T fclose
0200b200 0000004c T _close_r
0200b250 00000054 T _fstat_r
0200b2a8 0000000a T _getpid_r
0200b2b4 00000004 T _isatty_r
0200b2b8 0000000a T _kill_r
0200b2c4 0000004c T _lseek_r
0200b314 0000004c T _read_r
0200b364 00000054 T _sbrk_r
0200b3b8 00000006 T _times_r
0200b3c0 0000002c T _gettimeofday_r
0200b3ec 00000014 T _times
0200b400 0000004c T _write_r
0200b450 00000014 T _exit
0200b468 00000052 T build_argv
0200b4bc 00000020 T __get_handle
0200b4dc 0000003c T __alloc_handle
0200b518 0000002c T __release_handle
0200b544 00000014 T setDefaultDevice
0200b558 0000007c T AddDevice
0200b5d4 00000068 T FindDevice
0200b63c 00000020 T GetDeviceOpTab
0200b65c 00000024 T RemoveDevice
0200b680 T __aeabi_idiv
0200b680 00000094 T __divsi3
0200b714 0000000e T __aeabi_idivmod
0200b724 T __aeabi_drsub
0200b72c 00000314 T __aeabi_dsub
0200b72c 00000314 T __subdf3
0200b730 00000310 T __adddf3
0200b730 00000310 T __aeabi_dadd
0200ba40 00000024 T __aeabi_ui2d
0200ba40 00000024 T __floatunsidf
0200ba64 00000028 T __aeabi_i2d
0200ba64 00000028 T __floatsidf
0200ba8c 00000040 T __aeabi_f2d
0200ba8c 00000040 T __extendsfdf2
0200bacc 00000074 T __aeabi_ul2d
0200bacc 00000074 T __floatundidf
0200bae0 00000060 T __aeabi_l2d
0200bae0 00000060 T __floatdidf
0200bb40 00000290 T __aeabi_dmul
0200bb40 00000290 T __muldf3
0200bdd0 0000020c T __aeabi_ddiv
0200bdd0 0000020c T __divdf3
0200bfdc 00000094 T __gedf2
0200bfdc 00000094 T __gtdf2
0200bfe4 0000008c T __ledf2
0200bfe4 0000008c T __ltdf2
0200bfec 00000084 T __cmpdf2
0200bfec 00000084 T __eqdf2
0200bfec 00000084 T __nedf2
0200c070 00000034 T __aeabi_cdrcmple
0200c08c 00000018 T __aeabi_cdcmpeq
0200c08c 00000018 T __aeabi_cdcmple
0200c0a4 00000018 T __aeabi_dcmpeq
0200c0bc 00000018 T __aeabi_dcmplt
0200c0d4 00000018 T __aeabi_dcmple
0200c0ec 00000018 T __aeabi_dcmpge
0200c104 00000018 T __aeabi_dcmpgt
0200c11c 0000005c T __aeabi_d2iz
0200c11c 0000005c T __fixdfsi
0200c178 0000000c T __errno
0200c184 0000000c T _ZdaPv
0200c190 0000004c t _ZL21base_of_encoded_valuehP15_Unwind_Context
0200c1dc 0000016c t _ZL17parse_lsda_headerP15_Unwind_ContextPKhP16lsda_header_info
0200c348 0000073a T __gxx_personality_v0
0200ca84 00000010 T _ZSt13set_terminatePFvvE
0200ca94 00000010 T _ZSt14set_unexpectedPFvvE
0200caa4 00000020 T _ZN10__cxxabiv111__terminateEPFvvE
0200cac4 00000010 T _ZSt9terminatev
0200cad4 0000000c T _ZN10__cxxabiv112__unexpectedEPFvvE
0200cae0 00000010 T _ZSt10unexpectedv
0200caf0 00000018 T _Znaj
0200cb08 0000010e T _ZN9__gnu_cxx27__verbose_terminate_handlerEv
0200cc18 00000010 T _ZdlPv
0200cc28 000000f8 T __cxa_type_match
0200cd20 00000062 T __cxa_begin_cleanup
0200cd84 0000006a T __gnu_end_cleanup
0200cdf0 00000020 T __cxa_bad_typeid
0200ce10 00000020 T __cxa_bad_cast
0200ce30 00000048 T __cxa_call_terminate
0200ce78 00000122 T __cxa_call_unexpected
0200cf9c 00000004 T __cxa_get_exception_ptr
0200cfa0 00000012 T _ZSt18uncaught_exceptionv
0200cfb4 00000086 T __cxa_end_catch
0200d03c 00000086 T __cxa_begin_catch
0200d0c4 0000000c T _ZNSt9exceptionD2Ev
0200d0d0 0000000c T _ZNSt9exceptionD1Ev
0200d0dc 0000000c T _ZNSt13bad_exceptionD2Ev
0200d0e8 0000000c T _ZNSt13bad_exceptionD1Ev
0200d0f4 0000000c T _ZN10__cxxabiv115__forced_unwindD2Ev
0200d100 0000000c T _ZN10__cxxabiv115__forced_unwindD1Ev
0200d10c 0000000c T _ZN10__cxxabiv119__foreign_exceptionD2Ev
0200d118 0000000c T _ZN10__cxxabiv119__foreign_exceptionD1Ev
0200d124 00000008 T _ZNKSt9exception4whatEv
0200d12c 00000008 T _ZNKSt13bad_exception4whatEv
0200d134 0000001c T _ZN10__cxxabiv119__foreign_exceptionD0Ev
0200d150 0000001c T _ZN10__cxxabiv115__forced_unwindD0Ev
0200d16c 0000001c T _ZNSt9exceptionD0Ev
0200d188 0000001c T _ZNSt13bad_exceptionD0Ev
0200d1a4 00000008 T __cxa_get_globals_fast
0200d1ac 00000008 T __cxa_get_globals
0200d1b4 00000068 T __cxa_rethrow
0200d21c 0000005c T __cxa_throw
0200d278 00000034 t _ZL23__gxx_exception_cleanup19_Unwind_Reason_CodeP21_Unwind_Control_Block
0200d2ac 00000026 T __cxa_current_exception_type
0200d2d4 0000001c T _ZN10__cxxabiv123__fundamental_type_infoD1Ev
0200d2f0 0000001c T _ZN10__cxxabiv123__fundamental_type_infoD2Ev
0200d30c 00000020 T _ZN10__cxxabiv123__fundamental_type_infoD0Ev
0200d32c 00000010 T _ZSt15set_new_handlerPFvvE
0200d33c 00000008 T _ZNKSt9bad_alloc4whatEv
0200d344 0000001c T _ZNSt9bad_allocD1Ev
0200d360 0000001c T _ZNSt9bad_allocD2Ev
0200d37c 00000020 T _ZNSt9bad_allocD0Ev
0200d39c 0000006a T _Znwj
0200d408 00000004 T _ZNK10__cxxabiv119__pointer_type_info14__is_pointer_pEv
0200d40c 0000004c T _ZNK10__cxxabiv119__pointer_type_info15__pointer_catchEPKNS_17__pbase_type_infoEPPvj
0200d458 0000001c T _ZN10__cxxabiv119__pointer_type_infoD1Ev
0200d474 0000001c T _ZN10__cxxabiv119__pointer_type_infoD2Ev
0200d490 00000020 T _ZN10__cxxabiv119__pointer_type_infoD0Ev
0200d4b0 00000014 T __cxa_pure_virtual
0200d4c4 0000002e T _ZNK10__cxxabiv120__si_class_type_info11__do_upcastEPKNS_17__class_type_infoEPKvRNS1_15__upcast_resultE
0200d4f4 00000096 T _ZNK10__cxxabiv120__si_class_type_info12__do_dyncastEiNS_17__class_type_info10__sub_kindEPKS1_PKvS4_S6_RNS1_16__dyncast_resultE
0200d58c 00000048 T _ZNK10__cxxabiv120__si_class_type_info20__do_find_public_srcEiPKvPKNS_17__class_type_infoES2_
0200d5d4 0000001c T _ZN10__cxxabiv120__si_class_type_infoD1Ev
0200d5f0 0000001c T _ZN10__cxxabiv120__si_class_type_infoD2Ev
0200d60c 00000020 T _ZN10__cxxabiv120__si_class_type_infoD0Ev
0200d62c 0000000c T _ZNSt9type_infoD2Ev
0200d638 0000000c T _ZNSt9type_infoD1Ev
0200d644 0000000c T _ZNKSt9type_infoeqERKS_
0200d650 00000004 T _ZNKSt9type_info14__is_pointer_pEv
0200d654 00000004 T _ZNKSt9type_info15__is_function_pEv
0200d658 0000000c T _ZNKSt9type_info10__do_catchEPKS_PPvj
0200d664 00000004 T _ZNKSt9type_info11__do_upcastEPKN10__cxxabiv117__class_type_infoEPPv
0200d668 0000001c T _ZNSt9type_infoD0Ev
0200d684 00000008 T _ZNKSt8bad_cast4whatEv
0200d68c 0000001c T _ZNSt8bad_castD1Ev
0200d6a8 0000001c T _ZNSt8bad_castD2Ev
0200d6c4 00000020 T _ZNSt8bad_castD0Ev
0200d6e4 00000008 T _ZNKSt10bad_typeid4whatEv
0200d6ec 0000001c T _ZNSt10bad_typeidD1Ev
0200d708 0000001c T _ZNSt10bad_typeidD2Ev
0200d724 00000020 T _ZNSt10bad_typeidD0Ev
0200d744 0000003e T _ZNK10__cxxabiv117__class_type_info11__do_upcastEPKS0_PPv
0200d784 00000012 T _ZNK10__cxxabiv117__class_type_info20__do_find_public_srcEiPKvPKS0_S2_
0200d798 00000020 T _ZNK10__cxxabiv117__class_type_info11__do_upcastEPKS0_PKvRNS0_15__upcast_resultE
0200d7b8 0000004a T _ZNK10__cxxabiv117__class_type_info12__do_dyncastEiNS0_10__sub_kindEPKS0_PKvS3_S5_RNS0_16__dyncast_resultE
0200d804 00000034 T _ZNK10__cxxabiv117__class_type_info10__do_catchEPKSt9type_infoPPvj
0200d838 0000001c T _ZN10__cxxabiv117__class_type_infoD1Ev
0200d854 0000001c T _ZN10__cxxabiv117__class_type_infoD2Ev
0200d870 00000020 T _ZN10__cxxabiv117__class_type_infoD0Ev
0200d890 00000002 t _GLOBAL__I___cxa_allocate_exception
0200d894 0000003c T __cxa_free_dependent_exception
0200d8d0 0000003c T __cxa_free_exception
0200d90c 00000084 T __cxa_allocate_dependent_exception
0200d990 00000088 T __cxa_allocate_exception
0200da18 00000018 W _ZNK10__cxxabiv117__pbase_type_info15__pointer_catchEPKS0_PPvj
0200da30 00000064 T _ZNK10__cxxabiv117__pbase_type_info10__do_catchEPKSt9type_infoPPvj
0200da94 0000001c T _ZN10__cxxabiv117__pbase_type_infoD1Ev
0200dab0 0000001c T _ZN10__cxxabiv117__pbase_type_infoD2Ev
0200dacc 00000020 T _ZN10__cxxabiv117__pbase_type_infoD0Ev
0200daec T _fini
0200daf8 A __text_end
0200e1dc 000000c4 r standard_subs
0200e2a0 00000280 r cplus_demangle_builtin_types
0200e520 00000350 r cplus_demangle_operators
0200e884 00000004 R _global_impure_ptr
0200e9ec 00000010 r blanks.3548
0200e9fc 00000010 r zeroes.3549
0200ea94 00000030 r lconv
0200eadc 00000048 r JIS_state_table
0200eb24 00000048 r JIS_action_table
0200eb70 000000c8 R __mprec_tens
0200ec38 0000000c r p05.2435
0200ec48 00000028 R __mprec_bigtens
0200ec70 00000028 R __mprec_tinytens
0200ec98 0000005c R dotab_stdnull
0200f4f8 00000014 R _ZTVN10__cxxabiv115__forced_unwindE
0200f510 00000008 R _ZTISt9exception
0200f518 00000014 R _ZTVSt9exception
0200f530 00000008 R _ZTIN10__cxxabiv115__forced_unwindE
0200f538 00000012 R _ZTSSt13bad_exception
0200f54c 00000024 R _ZTSN10__cxxabiv119__foreign_exceptionE
0200f594 00000008 R _ZTIN10__cxxabiv119__foreign_exceptionE
0200f5a0 00000014 R _ZTVSt13bad_exception
0200f5b8 0000000d R _ZTSSt9exception
0200f5c8 00000014 R _ZTVN10__cxxabiv119__foreign_exceptionE
0200f5e0 00000020 R _ZTSN10__cxxabiv115__forced_unwindE
0200f600 0000000c R _ZTISt13bad_exception
0200f60c 00000010 V _ZTIPKe
0200f61c 00000010 V _ZTIPe
0200f62c 00000008 V _ZTIe
0200f634 00000010 V _ZTIPKd
0200f644 00000010 V _ZTIPd
0200f654 00000008 V _ZTId
0200f65c 00000010 V _ZTIPKf
0200f66c 00000010 V _ZTIPf
0200f67c 00000008 V _ZTIf
0200f684 00000010 V _ZTIPKy
0200f694 00000010 V _ZTIPy
0200f6a4 00000008 V _ZTIy
0200f6ac 00000010 V _ZTIPKx
0200f6bc 00000010 V _ZTIPx
0200f6cc 00000008 V _ZTIx
0200f6d4 00000010 V _ZTIPKm
0200f6e4 00000010 V _ZTIPm
0200f6f4 00000008 V _ZTIm
0200f6fc 00000010 V _ZTIPKl
0200f70c 00000010 V _ZTIPl
0200f71c 00000008 V _ZTIl
0200f724 00000010 V _ZTIPKj
0200f734 00000010 V _ZTIPj
0200f744 00000008 V _ZTIj
0200f74c 00000010 V _ZTIPKi
0200f75c 00000010 V _ZTIPi
0200f76c 00000008 V _ZTIi
0200f774 00000010 V _ZTIPKt
0200f784 00000010 V _ZTIPt
0200f794 00000008 V _ZTIt
0200f79c 00000010 V _ZTIPKs
0200f7ac 00000010 V _ZTIPs
0200f7bc 00000008 V _ZTIs
0200f7c4 00000010 V _ZTIPKh
0200f7d4 00000010 V _ZTIPh
0200f7e4 00000008 V _ZTIh
0200f7ec 00000010 V _ZTIPKa
0200f7fc 00000010 V _ZTIPa
0200f80c 00000008 V _ZTIa
0200f814 00000010 V _ZTIPKc
0200f824 00000010 V _ZTIPc
0200f834 00000008 V _ZTIc
0200f83c 00000010 V _ZTIPKDi
0200f84c 00000010 V _ZTIPDi
0200f85c 00000008 V _ZTIDi
0200f864 00000010 V _ZTIPKDs
0200f874 00000010 V _ZTIPDs
0200f884 00000008 V _ZTIDs
0200f88c 00000010 V _ZTIPKw
0200f89c 00000010 V _ZTIPw
0200f8ac 00000008 V _ZTIw
0200f8b4 00000010 V _ZTIPKb
0200f8c4 00000010 V _ZTIPb
0200f8d4 00000008 V _ZTIb
0200f8dc 00000010 V _ZTIPKv
0200f8ec 00000010 V _ZTIPv
0200f8fc 00000008 V _ZTIv
0200f904 00000004 V _ZTSPKe
0200f908 00000003 V _ZTSPe
0200f90c 00000002 V _ZTSe
0200f910 00000004 V _ZTSPKd
0200f914 00000003 V _ZTSPd
0200f918 00000002 V _ZTSd
0200f91c 00000004 V _ZTSPKf
0200f920 00000003 V _ZTSPf
0200f924 00000002 V _ZTSf
0200f928 00000004 V _ZTSPKy
0200f92c 00000003 V _ZTSPy
0200f930 00000002 V _ZTSy
0200f934 00000004 V _ZTSPKx
0200f938 00000003 V _ZTSPx
0200f93c 00000002 V _ZTSx
0200f940 00000004 V _ZTSPKm
0200f944 00000003 V _ZTSPm
0200f948 00000002 V _ZTSm
0200f94c 00000004 V _ZTSPKl
0200f950 00000003 V _ZTSPl
0200f954 00000002 V _ZTSl
0200f958 00000004 V _ZTSPKj
0200f95c 00000003 V _ZTSPj
0200f960 00000002 V _ZTSj
0200f964 00000004 V _ZTSPKi
0200f968 00000003 V _ZTSPi
0200f96c 00000002 V _ZTSi
0200f970 00000004 V _ZTSPKt
0200f974 00000003 V _ZTSPt
0200f978 00000002 V _ZTSt
0200f97c 00000004 V _ZTSPKs
0200f980 00000003 V _ZTSPs
0200f984 00000002 V _ZTSs
0200f988 00000004 V _ZTSPKh
0200f98c 00000003 V _ZTSPh
0200f990 00000002 V _ZTSh
0200f994 00000004 V _ZTSPKa
0200f998 00000003 V _ZTSPa
0200f99c 00000002 V _ZTSa
0200f9a0 00000004 V _ZTSPKc
0200f9a4 00000003 V _ZTSPc
0200f9a8 00000002 V _ZTSc
0200f9ac 00000005 V _ZTSPKDi
0200f9b4 00000004 V _ZTSPDi
0200f9b8 00000003 V _ZTSDi
0200f9bc 00000005 V _ZTSPKDs
0200f9c4 00000004 V _ZTSPDs
0200f9c8 00000003 V _ZTSDs
0200f9cc 00000004 V _ZTSPKw
0200f9d0 00000003 V _ZTSPw
0200f9d4 00000002 V _ZTSw
0200f9d8 00000004 V _ZTSPKb
0200f9dc 00000003 V _ZTSPb
0200f9e0 00000002 V _ZTSb
0200f9e4 00000004 V _ZTSPKv
0200f9e8 00000003 V _ZTSPv
0200f9ec 00000002 V _ZTSv
0200f9f0 0000000c R _ZTIN10__cxxabiv123__fundamental_type_infoE
0200f9fc 00000028 R _ZTSN10__cxxabiv123__fundamental_type_infoE
0200fa28 00000020 R _ZTVN10__cxxabiv123__fundamental_type_infoE
0200fa48 00000014 R _ZTVSt9bad_alloc
0200fa60 0000000d R _ZTSSt9bad_alloc
0200fa70 0000000c R _ZTISt9bad_alloc
0200fa8c 00000001 R _ZSt7nothrow
0200fa90 00000024 R _ZTSN10__cxxabiv119__pointer_type_infoE
0200fab4 0000000c R _ZTIN10__cxxabiv119__pointer_type_infoE
0200fac0 00000024 R _ZTVN10__cxxabiv119__pointer_type_infoE
0200fb08 0000002c R _ZTVN10__cxxabiv120__si_class_type_infoE
0200fb38 0000000c R _ZTIN10__cxxabiv120__si_class_type_infoE
0200fb44 00000025 R _ZTSN10__cxxabiv120__si_class_type_infoE
0200fb6c 00000008 R _ZTISt9type_info
0200fb74 0000000d R _ZTSSt9type_info
0200fb88 00000020 R _ZTVSt9type_info
0200fba8 0000000c R _ZTISt8bad_cast
0200fbb4 0000000c R _ZTSSt8bad_cast
0200fbc0 00000014 R _ZTVSt8bad_cast
0200fbe8 00000014 R _ZTVSt10bad_typeid
0200fc00 0000000c R _ZTISt10bad_typeid
0200fc1c 0000000f R _ZTSSt10bad_typeid
0200fc30 0000002c R _ZTVN10__cxxabiv117__class_type_infoE
0200fc60 0000000c R _ZTIN10__cxxabiv117__class_type_infoE
0200fc6c 00000022 R _ZTSN10__cxxabiv117__class_type_infoE
0200fc90 0000000c R _ZTIN10__cxxabiv117__pbase_type_infoE
0200fc9c 00000022 R _ZTSN10__cxxabiv117__pbase_type_infoE
0200fcc0 00000024 R _ZTVN10__cxxabiv117__pbase_type_infoE
0200ff44 A __exidx_start
02010364 A __exidx_end
02010364 t __frame_dummy_init_array_entry
02010364 A __init_array_start
02010364 A __preinit_array_end
02010364 A __preinit_array_start
0201036c t __do_global_dtors_aux_fini_array_entry
0201036c A __fini_array_start
0201036c A __init_array_end
02010370 r __EH_FRAME_BEGIN__
02010370 A __fini_array_end
02011114 r __FRAME_END__
02011118 d __JCR_END__
02011118 d __JCR_LIST__
0201111c A __data_start
0201111c D __dso_handle
0201111c A __ewram_start
02011120 00000004 D fifo_freewords
02011124 00000004 D fifo_send_queue
02011128 00000004 D fifo_buffer_free
0201112c 00000004 D fifo_receive_queue
02011130 00000004 D _impure_ptr
02011138 00000428 d impure_data
02011560 00000408 D __malloc_av_
02011968 00000004 D __malloc_sbrk_base
0201196c 00000004 D __malloc_trim_threshold
02011970 00000004 d charset
02011974 0000000c d last_lc_ctype.1268
02011980 0000000c D __lc_ctype
0201198c 0000000c d last_lc_messages.1270
02011998 0000000c d lc_messages.1269
020119a4 00000004 D __mb_cur_max
020119a8 00000004 d defaultDevice
020119ac 00000040 D devoptab_list
020119ec 00000004 D _ZN10__cxxabiv119__terminate_handlerE
020119f0 00000004 D _ZN10__cxxabiv120__unexpected_handlerE
020119f4 A __bss_start
020119f4 A __bss_start__
020119f4 A __bss_vma
020119f4 A __data_end
020119f4 A __dtcm_lma
020119f4 A __itcm_lma
020119f4 b completed.2775
020119f8 b object.2787
02011a10 00000004 b __timeout
02011a14 00000004 B processing
02011a18 00000001 b _ZZN9__gnu_cxx27__verbose_terminate_handlerEvE11terminating
02011a1c 0000000c b _ZL10eh_globals
02011a28 00000004 B __new_handler
02011a2c 00000004 b _ZL15dependents_used
02011a30 000001e0 b _ZL17dependents_buffer
02011b84 A __vectors_lma
02011c10 00000004 b _ZL14emergency_used
02011c18 00000800 b _ZL16emergency_buffer
02012418 00000004 B __malloc_top_pad
0201241c 00000028 B __malloc_current_mallinfo
02012444 00000004 B __malloc_max_sbrked_mem
02012448 00000004 B __malloc_max_total_mem
0201244c 00000004 B __nlocale_changed
02012450 00000004 B __mlocale_changed
02012454 00000004 B _PathLocale
02012458 00000004 b heap_start.2602
0201245c 00000004 B fake_heap_end
02012460 00000004 B fake_heap_start
02012464 00000008 B __syscalls
0201246c 00001000 b handles
0201346c 00000004 B theTime
02013470 00000040 B fifo_datamsg_data
020134b0 00000800 B fifo_buffer
02013cb0 00000040 B fifo_value32_func
02013cf0 00000040 B fifo_address_func
02013d30 00000040 B fifo_value32_data
02013d70 00000040 B fifo_value32_queue
02013db0 00000040 B fifo_data_queue
02013df0 00000040 B fifo_address_data
02013e30 00000040 B fifo_datamsg_func
02013e70 00000040 B fifo_address_queue
02013eb0 00000004 B punixTime
02013eb4 A __bss_end
02013eb4 A __bss_end__
02013eb4 A __end__
02013eb4 A _end
023ff000 A __eheap_end
023ff000 A __ewram_end
027fff70 a _libnds_argv
0b000000 A __dtcm_end
0b000000 A __dtcm_start
0b000000 A __sbss_end
0b000000 A __sbss_start
0b000000 A __sbss_start__
0b003d00 A __sp_usr
0b003e00 A __sp_irq
0b003f00 A __sp_svc
0b003ff8 A __irq_flags
0b003ffc A __irq_vector
0b004000 A __dtcm_to

Well, I did say this was going to be the long story, didn't I? Everything that was in the base project is in here as well. The additional parts can be summarized as follows.

# Additions w.r.t the base case.
02001440 - 02004398     2F58    : d_* routines
020043a4 - 02004434     0090    : software div (__aeabi_uidiv etc)
02004434 - 02005408     0FD4    : exception unwind routines
02005418 - 0200b680     6268    : various libc : printf et al, malloc,mem*,locale, Device, etc
0200b680 - 0200c184     0B04    : div and FP math routines. (for printf)
0200c190 - 0200daf8     1968    : exception/typeinfo routines.
0200daf8 A __text_end
0200e1dc - 0200ff44     1D68    : exception/typeinfo strings and pointers.
02013eb4 A _end

There are three main areas to discern:

  • d_*() routines, presumably for debug printing. (size: 12k)
  • Stdio formatting and related. This includes file handling, device handling and many forms of printf, which brings a whole lot of baggage (some allocation, format parsing and math/floating point routines). There are also some abort and signalling routines. (size: 28k)
  • Exception handling. Not just routines for handling them, but also the typeinfo stuff required, the output strings and the output string pointers. (size: 18k)
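
The three group sizes can be reproduced from the table with a bit of arithmetic. The sketch below is only a cross-check: the constant names are made up for illustration, the tiny software-division block is assumed to belong to the stdio/math group, and "k" is taken as 1000 bytes because that matches the rounded figures in the list.

```cpp
#include <cstdint>

// Sizes in bytes, copied from the "Additions" table above.
constexpr std::uint32_t kDRoutines  = 0x2F58;  // d_* routines
constexpr std::uint32_t kSoftDiv    = 0x0090;  // software div (__aeabi_uidiv etc)
constexpr std::uint32_t kUnwind     = 0x0FD4;  // exception unwind routines
constexpr std::uint32_t kLibc       = 0x6268;  // printf et al, malloc, mem*, locale, ...
constexpr std::uint32_t kPrintfMath = 0x0B04;  // div and FP math routines (for printf)
constexpr std::uint32_t kEhCode     = 0x1968;  // exception/typeinfo routines
constexpr std::uint32_t kEhData     = 0x1D68;  // exception/typeinfo strings and pointers

// Grouping is an assumption: software div is counted with the stdio/math group.
constexpr std::uint32_t kStdioGroup     = kLibc + kPrintfMath + kSoftDiv;
constexpr std::uint32_t kExceptionGroup = kUnwind + kEhCode + kEhData;
constexpr std::uint32_t kTotal          = kDRoutines + kStdioGroup + kExceptionGroup;

// Round a byte count to the nearest "k", with 1k = 1000 bytes.
constexpr std::uint32_t to_k(std::uint32_t bytes) { return (bytes + 500) / 1000; }
```

Here to_k() gives 12, 28 and 18 for the three groups, and kTotal is 58360 bytes, which is the "roughly 60k" quoted in the text.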

This roughly 60k of stuff is the overhead of exceptions: any potential exception. In this case, it's because new requires a bad_alloc exception when it's unable to allocate more.

The problem is that exceptions have many dependencies: to do exception handling, you keep track and unwind the stack. You also need to be able to tell the type of exception thrown, which requires RTTI. And then you say which exception was thrown, so you need error messages, and a list of pointers to those messages, and a way to format and write those messages, hence the d_*() routines and all the stdio stuff.
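
To make that dependency chain concrete, here is a minimal sketch of what the standard, throwing new implies for a caller. The function name can_allocate is hypothetical; the comments mark which piece of the runtime each step leans on.

```cpp
#include <cstddef>
#include <cstdio>
#include <new>

// Returns true if `bytes` could be allocated with the standard operator new.
bool can_allocate(std::size_t bytes)
{
    try {
        void* p = ::operator new(bytes);   // throws std::bad_alloc on failure
        ::operator delete(p);
        return true;
    } catch (const std::bad_alloc& e) {    // getting here needs stack unwinding;
                                           // matching the handler by type needs typeinfo
        std::printf("allocation failed: %s\n", e.what());  // message string + stdio
        return false;
    }
}
```

Even if your own code never writes such a try/catch, the library version of new still contains the throw, so the unwinding and typeinfo support gets linked in anyway.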

3 Custom new/delete

There is a way around this, though: redefine new and related functions. Technically speaking, this is a bad idea if you don't know what you're doing, but it can be done. Note that you would need to overload four operators: new, delete and their array counterparts.

#include <stdlib.h>     // malloc(), free(), size_t

void* operator new(size_t size)     {   return malloc(size);    }

void operator delete(void *p)       {   free(p);                }

void* operator new[](size_t size)   {   return malloc(size);    }

void operator delete[](void *p)     {   free(p);                }

This way, you just incur the cost of malloc() and free(), which are only about 3k. But again, this is going against the standard and you'll really have to ask yourself if the (at best) 2% of main RAM you save with this is really worth it.
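
One thing the plain malloc() version leaves open is what happens when memory runs out: the operators return NULL and a constructor may then run on a null pointer. The variant below is a sketch under an assumption of mine, not something from the test project: it halts via std::abort() on failure. alloc_or_die is a hypothetical helper name, and noexcept requires C++11.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical policy: stop the program on out-of-memory instead of throwing
// or handing a null pointer to a constructor.
static void* alloc_or_die(std::size_t size)
{
    void* p = std::malloc(size ? size : 1);  // malloc(0) is allowed to return NULL
    if (!p)
        std::abort();
    return p;
}

// Replacements for the four global operators; none of them can throw.
void* operator new(std::size_t size)      { return alloc_or_die(size); }
void* operator new[](std::size_t size)    { return alloc_or_die(size); }
void  operator delete(void* p) noexcept   { std::free(p); }
void  operator delete[](void* p) noexcept { std::free(p); }
```

On the DS you would probably swap std::abort() for your own error screen, since abort itself drags in some signalling code.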

 
More on this can be read at http://brewforums.qualcomm.com/showthread.php?t=2033.

4 Other considerations and conclusions.

The binary size is not the same as the main RAM footprint. About 44 kb is other stuff.

The overhead of the standard new is 60 kb, which is all due to exceptions. You cannot remove it by using the compiler options -fno-exceptions and -fno-rtti, because that only affects your own code, not the standard libraries. You can remove this overhead by overloading new and related functions, but you have to be really careful with this.

I've also done a little bit of testing with vector, and it seems that vector's overhead also comes from new and can be removed the same way. However, other parts of vector (and STL) may use other exceptions, so it's quite possible it won't work in all cases.

Note that roughly 28 kb of the exception overhead is actually stdio related – specifically formatted printing: *printf. If you're using printf anyway, the effective overhead of exceptions is reduced considerably.

Finally, remember that the exception overhead amounts to roughly 2% of main RAM at most. In most homebrew cases it won't matter that much. When it does start to affect your app, you will likely have other parts that are easier and safer to optimize out.

 
Test project + notes.
 

11 thoughts on “Some new notes on NDS code size”

  1. Interesting. And I can read it the other way round, too: if we already have "new" in our program, using std::vector is 60KB "cheaper" than what one would suppose. How would the size then increase if we actually *use* the RTTI system for our own purposes?

    Btw, I investigated once the meaning of "impure" (http://sylvainhb.blogspot.com/2009/04/impuredata.html), which is basically all the state maintained by the library between two calls (such as the FILE* structures for stdin, stdout and stderr, for instance).

  2. Did you also test the nothrow overloaded version of new? It works like malloc in the sense that it returns NULL on failure instead of producing an exception. Unlike overriding the global new and delete functions, it is a standard supported feature, and even though some compilers have been shady with supporting it in the past (in particular MSVC), I believe it works fine in most now. However, to use it I believe you must include <new>, so it would be interesting to see if that itself causes bloat.

    Also it's interesting to note that you can overwrite the new and delete operators in a class, and while it can be sketchy to overwrite them instead of using the library provided versions, I wonder if this would provide a way for automatic alignment for certain objects (Setting up the alignment then calling global new or something).
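
A rough sketch of that class-level idea, for illustration only. The type AlignedBlock, the 32-byte figure and the over-allocation trick are all assumptions of this example; the operators are marked noexcept (C++11) so that a failed allocation yields a null pointer rather than an exception.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

constexpr std::size_t kAlign = 32;   // hypothetical alignment requirement

// Hypothetical type whose instances should always start on a 32-byte boundary.
struct AlignedBlock {
    unsigned char payload[64];

    static void* operator new(std::size_t size) noexcept
    {
        // Over-allocate so an aligned address plus a back-pointer always fit.
        void* raw = std::malloc(size + kAlign + sizeof(void*));
        if (!raw)
            return nullptr;
        std::uintptr_t start   = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
        std::uintptr_t aligned = (start + kAlign - 1) & ~static_cast<std::uintptr_t>(kAlign - 1);
        // Remember the original malloc() pointer just below the aligned address.
        reinterpret_cast<void**>(aligned)[-1] = raw;
        return reinterpret_cast<void*>(aligned);
    }

    static void operator delete(void* p) noexcept
    {
        if (p)
            std::free(reinterpret_cast<void**>(p)[-1]);
    }
};
```

Because these operators belong to the class, `new AlignedBlock` uses them automatically while every other type keeps the global versions.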

  3. sylvainulg:
    And I can read it the other way round, too: if we already have "new" in our program, using std::vector is 60KB "cheaper" than what one would suppose.

    Yeah, that occurred to me also. I'm not sure what using RTTI yourself would do. Also, thanks for clearing up what impure does.

    Ian:
    Did you also test the nothrow overloaded version of new? It works like malloc in the sense that it returns NULL on failure instead of producing an exception.

    I did try `ptr = new(nothrow) u8[8];', but that still gave me the full overhead. Maybe I didn't do the test right.
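
For reference, the nothrow form under discussion looks roughly like this. byte_t stands in for libnds' u8 so the sketch builds without nds.h, and try_alloc is a hypothetical name. It only shows the syntax; it makes no claim about the resulting binary size.

```cpp
#include <cstddef>
#include <new>

typedef unsigned char byte_t;   // stand-in for libnds' u8

// Returns NULL instead of throwing when the allocation fails.
byte_t* try_alloc(std::size_t count)
{
    return new (std::nothrow) byte_t[count];
}
```

The caller checks the result for NULL, as with malloc(), and releases it with the ordinary delete[].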

  4. That is very odd. Taking a quick look at the -fno-rtti option in G++'s man page, g++ claims that it only generates rtti for exceptions as needed (the option itself just gets rid of rtti for dynamic casts and typeid, so if you do not have those in your code it could save a little space, and it will error out if you do have them and specify it). new(std::nothrow) should definitely be implemented in such a way that it won't throw any exceptions.

    I took your example program and replaced new with new(std::nothrow) and for extra measure replaced delete[] with ::operator delete[](ptr, std::nothrow) (should be unnecessary, since delete shouldn't throw anyway, right?). I actually noticed an increase in final executable size, although I am compiling for x64, not ARM, so there may be other factors; still, I would have thought the opposite. Assuming both are implemented in such a way that neither throws exceptions, I would expect G++ to not compile RTTI, and therefore the executable would be smaller. I do not have the time right now to inspect it further. Perhaps they have their own set of bloat that is worse than RTTI, or perhaps nothrow new in glib calls things that throw exceptions, or something else is causing the size increase independent of these two factors and rtti really isn't being compiled in. I wish I had more time to investigate.

  5. I had more time to look up more information on the nothrow operator. It turns out that when linking any version of new, gcc will always link in the standard exception-throwing version of new, and all the rtti with it. So it seems that the only way to avoid the overhead of exception handling from new is to define your own.

  6. I isolated and successfully took out a few heavy hitters with the following code:

     /** these are heavy guys from the lib i want to strip out. **/
     extern "C" char* _dtoa_r(_reent*, double, int, int, int*, int*, char**) {
       die(__FILE__, __LINE__);
     }
     
     extern "C" char* __cxa_demangle(const char* mangled_name,
     		       char* output_buffer, size_t* length,
     		       int* status) {
       if (status) *status = -2;
       return 0;
     }
     

    The basic idea is that if you provide a __cxa_demangle function in your own program, it overrides the one from the standard library, which is responsible for all the d_print_xxx functions you can spot. I still have to stress-test this approach (right now, the provided alternative says "sorry: I couldn't demangle that", which should safely give you mangled names in exception reports). I haven't seen any compiler flag to achieve the same result, but if anyone knows of one, I'm interested.
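
For context, this is roughly how the demangler being stubbed out is normally called, using the abi::__cxa_demangle entry point from GCC's <cxxabi.h>. The wrapper name readable_name is hypothetical. With the stub from the comment linked in, the call would always report failure and the wrapper would fall back to the raw name.

```cpp
#include <cstdlib>
#include <string>
#include <cxxabi.h>

// Returns the readable form of `mangled`, or the raw name when demangling fails.
std::string readable_name(const char* mangled)
{
    int status = 0;
    // With a null output buffer, __cxa_demangle malloc()s the result;
    // the caller is responsible for free()ing it.
    char* text = abi::__cxa_demangle(mangled, 0, 0, &status);
    if (status != 0 || text == 0)
        return mangled;              // e.g. status -2: not a valid mangled name
    std::string result(text);
    std::free(text);
    return result;
}
```

Fed a symbol from the map file such as _ZTISt9type_info, the stock demangler gives "typeinfo for std::type_info"; that readability is what you trade away for the smaller binary.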

  7. Nice! How did you find these? That is to say, how did you find out these are the ones at the top of it and can safely be removed?

    There are also two others that I found out about recently: __cxa_guard_acquire() and __cxa_guard_release(). They were introduced when I tried to create a local const array using data from a global constant instance that used templates. I'm assuming these call __cxa_demangle() at some point, but I can't be sure.

  8. $sylvain> nm -S --demangle runme/arm9/runme.arm9.elf | sort -k 2 | grep '^[0-9a-f]\+ [0-9a-f]\+ [^B] ' --color=always | less -R | grep terminate

    is how I find the "heavy hitters". I know that __cxa_* is run-time support. For _dtoa_r, I was suspicious because the disassembled code featured many calls to builtin_(insert arithmetic function name here)* things, while I'm not doing any FPU things internally. I tried replacing all the *printf with *iprintf and *scanf with *iscanf, and then I realised that at some point both functions actually called the same internal function that contains the full logic for floating points as well (maybe it's due to vsniprintf, and iprintf would be just fine).

    I then got a clue from the library content that "dtoa" is likely "double-to-ascii", and decided to replace it with a "runtime error report" function. So far, it hasn't affected the functionality of the code.

    The story for __cxa_demangle was more complicated. Initially, the function I was suspicious about was d_print_comp. Again, I tried disassembling and tracing back "who calls that", but it turned out that no one actually called it (that is, it is a virtual function of some sort, called only through a pointer, and the content is statically defined). Then I scanned lib*.a for a hit on d_print_comp (DS/dka-r21/arm-eabi/lib/libsupc++.a, if you ask), which revealed the symbol was present in cp-demangle.o, where d_print_comp is a "static" (internal) symbol and __cxa_demangle is the only "external" symbol. I further googled for information on __cxa_demangle, and found http://idlebox.net/2008/0901-stacktrace-demangled/cxa_demangle.htt, where I found the error codes, full function prototype, etc. I gave "status=-2" a try with a dummy exception, and all of a sudden it reports 16iScriptException to be caught, while the code shrunk by ~100K. Bingo.

    Similarly, I identified __cxa_terminate(), which I successfully replaced using std::set_terminate(my_terminator), which is responsible for handling uncaught exceptions. I don't feel like just abort()ing a DS program, so now when I get that, I fall back to a "press A to return to moonshell, B to download software upgrade" menu.

    Yet, I'm not hot about killing __cxa_guard_acquire() and __cxa_guard_release(). They are required to ensure you have a lock (guard) on the static initialisation. They're defined in guard.o, and from what I see of nm output on libsupc++.a, they rely on throw, unwind, class-type-info, etc., but not __cxa_demangle directly. Even if they did, I did not _remove_ __cxa_demangle here, just replaced it with an "oh, sorry. I cannot demangle that for you. How about showing it raw to the user?"
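
The std::set_terminate() approach mentioned above can be sketched as follows. The handler name my_terminator comes from the comment; its body and the install_terminator helper are placeholders, since the actual menu code isn't shown.

```cpp
#include <cstdlib>
#include <exception>

// Placeholder handler: a real DS program would draw its
// "press A / press B" menu here instead of aborting right away.
void my_terminator()
{
    std::abort();   // a terminate handler must not return to its caller
}

// Call once at start-up, before anything can throw.
void install_terminator()
{
    std::set_terminate(my_terminator);
}
```

After install_terminator() has run, an uncaught exception ends up in my_terminator() rather than in the library's default handler.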
