<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Some new notes on NDS code size</title>
	<atom:link href="http://www.coranac.com/2009/11/sizeof-new/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.coranac.com/2009/11/sizeof-new/</link>
	<description>my own little world</description>
	<lastBuildDate>Fri, 23 Dec 2011 16:50:19 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.4</generator>
	<item>
		<title>By: sylvainulg</title>
		<link>http://www.coranac.com/2009/11/sizeof-new/comment-page-1/#comment-2826</link>
		<dc:creator>sylvainulg</dc:creator>
		<pubDate>Thu, 17 Dec 2009 16:38:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.coranac.com/?p=133#comment-2826</guid>
		<description>$sylvain&gt; nm -S --demangle runme/arm9/runme.arm9.elf &#124; sort -k 2 &#124; grep &#039;^[0-9a-f]\+ [0-9a-f]\+ [^B] &#039; --color=always &#124; less -R &#124; grep terminate 

is how I find the &quot;heavy hitters&quot;. I know that __cxa_* is run-time support. For _dtoa_r, I was suspicious because the disassembled code featured many calls to builtin_(insert arithmetic function name here)* things, while I&#039;m not doing any FPU things internally. I tried replacing all the *printf with *iprintf and *scanf with *iscanf and then I realised that at some point, both functions actually called the same internal function that contains the full logic for floating points as well (maybe it&#039;s due to vsniprintf and that iprintf would be just fine).

I then gathered from the library content that &quot;dtoa&quot; is likely &quot;double-to-ascii&quot;, and decided to replace it with a &quot;runtime error report&quot; function. So far, it hasn&#039;t affected the functionality of the code.

The story for __cxa_demangle was more complicated. Initially, the function I was suspicious about was d_print_comp. Again, I tried disassembling and tracing back &quot;who calls that&quot;, but it turned out that no one actually called it (that is, it is a virtual function of some sort, called only through a pointer and the content is statically defined). Then I scanned lib*.a for a hit on d_print_comp (DS/dka-r21/arm-eabi/lib/libsupc++.a, if you ask), which revealed the symbol was present in cp-demangle.o, where d_print_comp is a &quot;static&quot; (internal) symbol, and __cxa_demangle is the only &quot;external&quot; symbol. I further googled for information on __cxa_demangle, and found http://idlebox.net/2008/0901-stacktrace-demangled/cxa_demangle.htt, where I found the error codes, full function prototype, etc. I gave &quot;status=-2&quot; a try with a dummy exception, and all of a sudden, it reports 16iScriptException to be caught, while the code shrank by ~100K. Bingo.

Similarly, I identified __cxa_terminate() which I successfully replaced using std::set_terminate(my_terminator); the handler is then responsible for handling uncaught exceptions. I don&#039;t feel like just abort()ing a DS program, so now when that happens, I fall back to a &quot;press A to return to moonshell, B to download software upgrade&quot; menu.

Yet, I&#039;m not hot about killing __cxa_guard_acquire() and __cxa_guard_release(). They are required to ensure you get a lock (guard) on the static initialisation. They&#039;re defined in guard.o, and from what I see of nm output on libsupc++.a, they rely on throw, unwind, class-type-info, etc., but not __cxa_demangle directly. Even if they did, I did not _remove_ __cxa_demangle here, just replaced it with an &quot;oh, sorry. I cannot demangle that for you. How about showing it raw to the user ?&quot;</description>
		<content:encoded><![CDATA[<p>$sylvain&gt; nm -S --demangle runme/arm9/runme.arm9.elf | sort -k 2 | grep '^[0-9a-f]\+ [0-9a-f]\+ [^B] ' --color=always | less -R | grep terminate </p>
<p> is how I find the &#8220;heavy hitters&#8221;. I know that __cxa_* is run-time support. For _dtoa_r, I was suspicious because the disassembled code featured many calls to builtin_(insert arithmetic function name here)* things, while I&#8217;m not doing any FPU things internally. I tried replacing all the *printf with *iprintf and *scanf with *iscanf and then I realised that at some point, both functions actually called the same internal function that contains the full logic for floating points as well (maybe it&#8217;s due to vsniprintf and that iprintf would be just fine).</p>
<p> I then gathered from the library content that &#8220;dtoa&#8221; is likely &#8220;double-to-ascii&#8221;, and decided to replace it with a &#8220;runtime error report&#8221; function. So far, it hasn&#8217;t affected the functionality of the code.</p>
<p> The story for __cxa_demangle was more complicated. Initially, the function I was suspicious about was d_print_comp. Again, I tried disassembling and tracing back &#8220;who calls that&#8221;, but it turned out that no one actually called it (that is, it is a virtual function of some sort, called only through a pointer and the content is statically defined). Then I scanned lib*.a for a hit on d_print_comp (DS/dka-r21/arm-eabi/lib/libsupc++.a, if you ask), which revealed the symbol was present in cp-demangle.o, where d_print_comp is a &#8220;static&#8221; (internal) symbol, and __cxa_demangle is the only &#8220;external&#8221; symbol. I further googled for information on __cxa_demangle, and found <a href="http://idlebox.net/2008/0901-stacktrace-demangled/cxa_demangle.htt" rel="nofollow">http://idlebox.net/2008/0901-stacktrace-demangled/cxa_demangle.htt</a>, where I found the error codes, full function prototype, etc. I gave &#8220;status=-2&#8221; a try with a dummy exception, and all of a sudden, it reports 16iScriptException to be caught, while the code shrank by ~100K. Bingo.</p>
<p> Similarly, I identified __cxa_terminate() which I successfully replaced using std::set_terminate(my_terminator); the handler is then responsible for handling uncaught exceptions. I don&#8217;t feel like just abort()ing a DS program, so now when that happens, I fall back to a &#8220;press A to return to moonshell, B to download software upgrade&#8221; menu.</p>
<p> Yet, I&#8217;m not hot about killing __cxa_guard_acquire() and __cxa_guard_release(). They are required to ensure you get a lock (guard) on the static initialisation. They&#8217;re defined in guard.o, and from what I see of nm output on libsupc++.a, they rely on throw, unwind, class-type-info, etc., but not __cxa_demangle directly. Even if they did, I did not _remove_ __cxa_demangle here, just replaced it with an &#8220;oh, sorry. I cannot demangle that for you. How about showing it raw to the user ?&#8221;</p>
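<p> A minimal sketch of the terminate-handler replacement mentioned above. The handler body is an assumption: it just prints a message and aborts, where a DS program would show its fallback menu instead. install_terminator is a hypothetical helper name; only std::set_terminate and std::get_terminate are standard calls.</p>

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <exception>

// Hypothetical handler: report the problem and stop. On a DS this is where
// the "press A / press B" fallback menu would go instead of the message.
static void my_terminator() {
    std::fputs("uncaught exception, falling back\n", stderr);
    std::abort();  // a terminate handler must not return to its caller
}

// Installs my_terminator and returns the handler that was active before,
// so the caller can restore it later if needed.
std::terminate_handler install_terminator() {
    std::terminate_handler previous = std::set_terminate(my_terminator);
    assert(std::get_terminate() == my_terminator);  // std::get_terminate is C++11
    return previous;
}
```

<p> Calling install_terminator() once at startup is enough. Whether this drops the demangler from the binary depends on the toolchain and on what else still refers to the default handler.</p>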
]]></content:encoded>
	</item>
	<item>
		<title>By: cearn</title>
		<link>http://www.coranac.com/2009/11/sizeof-new/comment-page-1/#comment-2825</link>
		<dc:creator>cearn</dc:creator>
		<pubDate>Thu, 17 Dec 2009 16:01:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.coranac.com/?p=133#comment-2825</guid>
		<description>&lt;i&gt;Nice!&lt;/i&gt; How did you find these? That is to say, how did you find out these are the ones at the top of it and can safely be removed?

There are also two others that I found out about recently: &lt;code&gt;__cxa_guard_acquire()&lt;/code&gt; and &lt;code&gt;__cxa_guard_release()&lt;/code&gt;. They were introduced when I tried to create a local const array using data from a global constant instance that used templates. I&#039;m assuming these call &lt;code&gt;__cxa_demangle()&lt;/code&gt; at some point, but I can&#039;t be sure.</description>
		<content:encoded><![CDATA[<p><i>Nice!</i> How did you find these? That is to say, how did you find out these are the ones at the top of it and can safely be removed?</p>
<p> There are also two others that I found out about recently: <code>__cxa_guard_acquire()</code> and <code>__cxa_guard_release()</code>. They were introduced when I tried to create a local const array using data from a global constant instance that used templates. I&#8217;m assuming these call <code>__cxa_demangle()</code> at some point, but I can&#8217;t be sure.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: sylvainulg</title>
		<link>http://www.coranac.com/2009/11/sizeof-new/comment-page-1/#comment-2814</link>
		<dc:creator>sylvainulg</dc:creator>
		<pubDate>Wed, 16 Dec 2009 11:14:33 +0000</pubDate>
		<guid isPermaLink="false">http://www.coranac.com/?p=133#comment-2814</guid>
		<description>I isolated and successfully took out a few heavy hitters with the following code:
&lt;pre&gt;
/** these are heavy guys from the lib i want to strip out. **/
extern &quot;C&quot; char* _dtoa_r(_reent*, double, int, int, int*, int*, char**) {
  die(__FILE__, __LINE__);
  return 0; /* only reached if die() returns */
}

extern &quot;C&quot; char* __cxa_demangle(const char* mangled_name,
		       char* output_buffer, size_t* length,
		       int* status) {
  if (status) *status = -2; /* -2: not a valid mangled name */
  return 0;
}
&lt;/pre&gt;

Basic idea is that if you provide a __cxa_demangle function in your own program, it overrides the one from the standard library, which is responsible for all the d_print_xxx functions you can spot. I still have to stress-test this approach (right now, the provided alternative says &quot;sorry: I couldn&#039;t demangle that&quot;, which should safely give you mangled names in exception reports). I haven&#039;t seen any compiler flag to achieve the same result, but if anyone knows of one, I&#039;m interested.</description>
		<content:encoded><![CDATA[<p>I isolated and successfully took out a few heavy hitters with the following code:</p>
<pre>
 /** these are heavy guys from the lib i want to strip out. **/
 extern "C" char* _dtoa_r(_reent*, double, int, int, int*, int*, char**) {
   die(__FILE__, __LINE__);
   return 0; /* only reached if die() returns */
 }

 extern "C" char* __cxa_demangle(const char* mangled_name,
 		       char* output_buffer, size_t* length,
 		       int* status) {
   if (status) *status = -2; /* -2: not a valid mangled name */
   return 0;
 }
 </pre>
<p> Basic idea is that if you provide a __cxa_demangle function in your own program, it overrides the one from the standard library, which is responsible for all the d_print_xxx functions you can spot. I still have to stress-test this approach (right now, the provided alternative says &#8220;sorry: I couldn&#8217;t demangle that&#8221;, which should safely give you mangled names in exception reports). I haven&#8217;t seen any compiler flag to achieve the same result, but if anyone knows of one, I&#8217;m interested.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ian</title>
		<link>http://www.coranac.com/2009/11/sizeof-new/comment-page-1/#comment-2631</link>
		<dc:creator>Ian</dc:creator>
		<pubDate>Thu, 26 Nov 2009 06:12:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.coranac.com/?p=133#comment-2631</guid>
		<description>I had more time to look up more information on the nothrow operator. It turns out that whichever version of new you use, gcc will always link in the standard exception-throwing version of new, and all the RTTI with it. So it seems that the only way to avoid the overhead of exception handling from new is to define your own ; ;</description>
		<content:encoded><![CDATA[<p>I had more time to look up more information on the nothrow operator. It turns out that whichever version of new you use, gcc will always link in the standard exception-throwing version of new, and all the RTTI with it. So it seems that the only way to avoid the overhead of exception handling from new is to define your own ; ;</p>
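<p> A minimal sketch of what &#8220;define your own&#8221; could look like, assuming malloc/free as the backing allocator and abort() on failure instead of throwing. allocation_count is a hypothetical helper, present only so the replacement can be observed. Whether this really keeps the exception code out of the binary depends on the toolchain and on what else gets linked.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter, only here so the replacement can be observed.
static std::size_t g_allocations = 0;
std::size_t allocation_count() { return g_allocations; }

// Replacement for the global operator new: malloc-backed, and it aborts
// instead of throwing std::bad_alloc when memory runs out.
void* operator new(std::size_t size) {
    if (size == 0) size = 1;        // malloc(0) may return a null pointer
    void* p = std::malloc(size);
    if (!p) std::abort();
    ++g_allocations;
    return p;
}

void* operator new[](std::size_t size) { return operator new(size); }

// Matching deallocation functions; free(NULL) is a no-op.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete[](void* p) noexcept { std::free(p); }
```

<p> These definitions go in exactly one source file of the program; every new and delete expression then ends up in them.</p>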
]]></content:encoded>
	</item>
	<item>
		<title>By: Ian</title>
		<link>http://www.coranac.com/2009/11/sizeof-new/comment-page-1/#comment-2629</link>
		<dc:creator>Ian</dc:creator>
		<pubDate>Wed, 25 Nov 2009 20:17:06 +0000</pubDate>
		<guid isPermaLink="false">http://www.coranac.com/?p=133#comment-2629</guid>
		<description>&lt;p&gt;
That is very odd. Taking a quick look at the -fno-rtti option in g++&#039;s man page, g++ claims that it only generates RTTI for exceptions as needed (the option itself just gets rid of RTTI for dynamic casts and typeid, so if you do not have those in your code it could save a little space, and it will error out if you do have them and specify it). new(std::nothrow) should definitely be implemented in such a way that it won&#039;t throw any exceptions.
&lt;/p&gt;
&lt;p&gt;
I took your example program and replaced new with new(std::nothrow) and for extra measure replaced delete[] with ::operator delete[](ptr, std::nothrow) (should be unnecessary since delete shouldn&#039;t throw anyway, right?). I actually noticed an increase in final executable size, although I am compiling for x64, not ARM, so there may be other factors; still, I would have thought the opposite. Assuming both are implemented in such a way that neither throws exceptions, I would expect G++ to not compile in RTTI, and therefore the executable would be smaller. I do not have the time right now to inspect it further. Perhaps they have their own set of bloat that is worse than RTTI, or perhaps nothrow new in glib calls things that throw exceptions, or something else is causing the size increase independent of these two factors and RTTI really isn&#039;t being compiled in. I wish I had more time to investigate.
&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>
 That is very odd. Taking a quick look at the -fno-rtti option in g++&#8217;s man page, g++ claims that it only generates RTTI for exceptions as needed (the option itself just gets rid of RTTI for dynamic casts and typeid, so if you do not have those in your code it could save a little space, and it will error out if you do have them and specify it). new(std::nothrow) should definitely be implemented in such a way that it won&#8217;t throw any exceptions.
 </p>
<p>
 I took your example program and replaced new with new(std::nothrow) and for extra measure replaced delete[] with ::operator delete[](ptr, std::nothrow) (should be unnecessary since delete shouldn&#8217;t throw anyway, right?). I actually noticed an increase in final executable size, although I am compiling for x64, not ARM, so there may be other factors; still, I would have thought the opposite. Assuming both are implemented in such a way that neither throws exceptions, I would expect G++ to not compile in RTTI, and therefore the executable would be smaller. I do not have the time right now to inspect it further. Perhaps they have their own set of bloat that is worse than RTTI, or perhaps nothrow new in glib calls things that throw exceptions, or something else is causing the size increase independent of these two factors and RTTI really isn&#8217;t being compiled in. I wish I had more time to investigate.
 </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: cearn</title>
		<link>http://www.coranac.com/2009/11/sizeof-new/comment-page-1/#comment-2626</link>
		<dc:creator>cearn</dc:creator>
		<pubDate>Wed, 25 Nov 2009 19:00:47 +0000</pubDate>
		<guid isPermaLink="false">http://www.coranac.com/?p=133#comment-2626</guid>
		<description>&lt;blockquote&gt;
&lt;b&gt;sylvainulg&lt;/b&gt;:&lt;br /&gt;
And I can read it the other way round, too: if we already have &quot;new&quot; in our program, using std::vector is 60KB &quot;cheaper&quot; than what one would suppose.
&lt;/blockquote&gt;
&lt;p&gt;
Yeah, that occurred to me also. I&#039;m not sure what using RTTI yourself would do. Also, thanks for clearing up what &lt;code&gt;impure&lt;/code&gt; does.
&lt;/p&gt;

&lt;blockquote&gt;
&lt;b&gt;Ian&lt;/b&gt;:&lt;br /&gt;
Did you also test the nothrow overloaded version of new? It works like malloc in the sense that it returns NULL on failure instead of producing an exception.
&lt;/blockquote&gt;
&lt;p&gt;
I did try &lt;code&gt;ptr = new(nothrow) u8[8];&lt;/code&gt;, but that still gave me the full overhead. Maybe I didn&#039;t do the test right.
&lt;/p&gt;</description>
		<content:encoded><![CDATA[<blockquote><p>
 <b>sylvainulg</b>:<br />
 And I can read it the other way round, too: if we already have &#8220;new&#8221; in our program, using std::vector is 60KB &#8220;cheaper&#8221; than what one would suppose.
 </p></blockquote>
<p>
 Yeah, that occurred to me also. I&#8217;m not sure what using RTTI yourself would do. Also, thanks for clearing up what <code>impure</code> does.
 </p>
<blockquote><p>
 <b>Ian</b>:<br />
 Did you also test the nothrow overloaded version of new? It works like malloc in the sense that it returns NULL on failure instead of producing an exception.
 </p></blockquote>
<p>
 I did try <code>ptr = new(nothrow) u8[8];</code>, but that still gave me the full overhead. Maybe I didn&#8217;t do the test right.
 </p>
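<p> For reference, a minimal sketch of the nothrow form discussed here, with the null check that goes with it. alloc_bytes is a hypothetical helper and the u8 typedef is a stand-in for the one the DS headers provide. It only shows the syntax and makes no claim about code size.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <new>      // declares std::nothrow

typedef unsigned char u8;   // stand-in for the DS typedef of the same name

// Returns a zero-filled buffer of n bytes, or a null pointer if the
// allocation fails. No exception is thrown on failure.
u8* alloc_bytes(std::size_t n) {
    u8* ptr = new (std::nothrow) u8[n];
    if (ptr) {
        for (std::size_t i = 0; i < n; ++i) ptr[i] = 0;
    }
    return ptr;
}
```

<p> The caller checks the result against a null pointer, as with malloc, and releases the buffer with delete[].</p>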
]]></content:encoded>
	</item>
	<item>
		<title>By: Ian</title>
		<link>http://www.coranac.com/2009/11/sizeof-new/comment-page-1/#comment-2625</link>
		<dc:creator>Ian</dc:creator>
		<pubDate>Wed, 25 Nov 2009 16:45:46 +0000</pubDate>
		<guid isPermaLink="false">http://www.coranac.com/?p=133#comment-2625</guid>
		<description>Did you also test the nothrow overloaded version of new? It works like malloc in the sense that it returns NULL on failure instead of producing an exception. Unlike overriding the global new and delete functions it is a standard supported feature, and even though some compilers have been shady with supporting it in the past (in particular MSVC) I believe it works fine in most now. However, to use it I believe you must include &amp;lt;new&amp;gt;, so it would be interesting to see if that itself causes bloat. 

Also it&#039;s interesting to note that you can overload the new and delete operators in a class, and while it can be sketchy to overload them instead of using the library-provided versions, I wonder if this would provide a way for automatic alignment for certain objects (setting up the alignment then calling global new or something).</description>
		<content:encoded><![CDATA[<p>Did you also test the nothrow overloaded version of new? It works like malloc in the sense that it returns NULL on failure instead of producing an exception. Unlike overriding the global new and delete functions it is a standard supported feature, and even though some compilers have been shady with supporting it in the past (in particular MSVC) I believe it works fine in most now. However, to use it I believe you must include &lt;new&gt;, so it would be interesting to see if that itself causes bloat. </p>
<p> Also it&#8217;s interesting to note that you can overload the new and delete operators in a class, and while it can be sketchy to overload them instead of using the library-provided versions, I wonder if this would provide a way for automatic alignment for certain objects (setting up the alignment then calling global new or something).</p>
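<p> A minimal sketch of the class-level idea, under stated assumptions: AlignedBuffer and its 32-byte alignment are made up for illustration. The class operator new over-allocates through the global one, rounds the address up, and stores the raw pointer just below the aligned block so that operator delete can hand it back.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical type that wants 32-byte aligned storage.
struct AlignedBuffer {
    static constexpr std::size_t kAlign = 32;   // assumed; must be a power of two
    unsigned char data[64];

    static void* operator new(std::size_t size) {
        // Room for the object, the worst-case padding, and the raw pointer.
        void* raw = ::operator new(size + kAlign + sizeof(void*));
        std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
        std::uintptr_t aligned =
            (start + kAlign - 1) & ~static_cast<std::uintptr_t>(kAlign - 1);
        assert(aligned % kAlign == 0);
        // Remember the raw pointer in the slot just below the aligned block.
        reinterpret_cast<void**>(aligned)[-1] = raw;
        return reinterpret_cast<void*>(aligned);
    }

    static void operator delete(void* p) noexcept {
        // Recover the raw pointer and release it through global delete.
        if (p) ::operator delete(static_cast<void**>(p)[-1]);
    }
};
```

<p> With this in place, a plain new AlignedBuffer returns suitably aligned storage and delete releases it. Array forms (new[] and delete[]) would need their own pair of operators.</p>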
]]></content:encoded>
	</item>
	<item>
		<title>By: sylvainulg</title>
		<link>http://www.coranac.com/2009/11/sizeof-new/comment-page-1/#comment-2620</link>
		<dc:creator>sylvainulg</dc:creator>
		<pubDate>Tue, 24 Nov 2009 16:59:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.coranac.com/?p=133#comment-2620</guid>
		<description>Interesting. And I can read it the other way round, too: if we already have &quot;new&quot; in our program, using std::vector is 60KB &quot;cheaper&quot; than what one would suppose. How would the size then increase if we actually *use* the RTTI system for our own purpose ?

Btw, I investigated once the meaning of &quot;impure&quot; (http://sylvainhb.blogspot.com/2009/04/impuredata.html), which is basically all the state maintained by the library between two calls (such as the FILE* structures for stdin, stdout and stderr, for instance).</description>
		<content:encoded><![CDATA[<p>Interesting. And I can read it the other way round, too: if we already have &#8220;new&#8221; in our program, using std::vector is 60KB &#8220;cheaper&#8221; than what one would suppose. How would the size then increase if we actually *use* the RTTI system for our own purpose ?</p>
<p> Btw, I investigated once the meaning of &#8220;impure&#8221; (<a href="http://sylvainhb.blogspot.com/2009/04/impuredata.html" rel="nofollow">http://sylvainhb.blogspot.com/2009/04/impuredata.html</a>), which is basically all the state maintained by the library between two calls (such as the FILE* structures for stdin, stdout and stderr, for instance).</p>
]]></content:encoded>
	</item>
</channel>
</rss>

