Size does matter: Optimizing with size in mind with GCC
by Nicolas Mendoza
Often the size of your programs does matter: to save bandwidth and resources for you and your end users, or because your target platform restricts the resources available. Many compilers have their own settings for optimizing code with size in mind, and so does the GNU C/C++ compiler.
There are plenty of ways to make your final binaries as small as possible. Some are to use as few instructions as possible, to group common tasks together (for instance in loops), to use shared libraries, and to remove dead code.
GCC already makes some effort to remove dead code and to optimize for the target processor, and many platforms support shared libraries, either linked at load time by the OS loader or opened explicitly through special APIs provided by the system.
I'll go through a few ways of shrinking your code without manually altering it, letting GCC do the job instead.
Small, yes, please
The obvious way to get small binaries is to use GCC's own -Os optimization level, which uses common techniques and a combination of switches to decrease the binary size. It may affect running speed and memory usage, but it's pretty safe. However, there are other, more interesting ways to shrink code size.
Binary formats
When GCC generates executables it often uses the ELF binary format. ELF is used in most operating systems these days, including AmigaOS 4 and GNU/Linux. Other systems use other formats: MS Windows uses PE, a variant of COFF, and AmigaOS 3.x uses HUNK.
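To make the idea of a binary format a bit more concrete, here is a minimal sketch in C of how a tool could recognize an ELF file: every ELF file starts with the four bytes 0x7f, 'E', 'L', 'F'. The function name looks_like_elf is made up for this illustration and is not part of any library.

```c
#include <stddef.h>

/* Hypothetical helper (not from any library): returns 1 if the buffer
 * starts with the four-byte ELF magic number, 0 otherwise. */
int looks_like_elf(const unsigned char *buf, size_t len) {
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    size_t i;

    /* Too short to hold the magic number, or no buffer at all. */
    if (buf == NULL || len < sizeof magic)
        return 0;

    for (i = 0; i < sizeof magic; i++) {
        if (buf[i] != magic[i])
            return 0;
    }
    return 1;
}
```

You would typically feed it the first few bytes read from the file you are curious about.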
Stripping
ELF (and other formats) lets binaries contain symbol tables, which are basically a mapping between a symbol name, for instance that of a function or a variable, and its position (address) in the file itself. These are used as an aid when reporting crashes or for debugging. Some debugging formats, like DWARF 2, carry far more extensive information, making it easy to use a debugger such as GDB to step through code while seeing information normally only available in the source.
Just to visualize the optimizations we use some example code. This is a very simple piece of code with two functions and only one is called:
#include <stdio.h>
void myFunc(void) {
   printf("myFunc\n");
}
void myFunc2(void) {
   printf("myFunc2\n");
}
int main(void) {
   myFunc();
   return 0;
}
GCC normally includes some symbol information in the binaries it builds. To avoid this you can strip the binary using the strip command, which lets you remove symbols and sections from files. First let's compile the example:
$ gcc -o example example.c
$ ls -gG example # (-gG no user and group information)
-rwxr-xr-x 1 6738 2007-04-10 07:48 example
You can list the symbols in a file by using the nm command:
$ nm example
080494b4 d _DYNAMIC
08049588 d _GLOBAL_OFFSET_TABLE_
08048488 R _IO_stdin_used
   w _Jv_RegisterClasses
080494a4 d __CTOR_END__
080494a0 d __CTOR_LIST__
080494ac d __DTOR_END__
080494a8 d __DTOR_LIST__
0804849c r __FRAME_END__
080494b0 d __JCR_END__
080494b0 d __JCR_LIST__
080495ac A __bss_start
080495a0 D __data_start
08048440 t __do_global_ctors_aux
08048320 t __do_global_dtors_aux
080495a4 D __dso_handle
   w __gmon_start__
08048439 T __i686.get_pc_thunk.bx
080494a0 d __init_array_end
080494a0 d __init_array_start
080483c0 T __libc_csu_fini
080483d0 T __libc_csu_init
   U __libc_start_main@@GLIBC_2.0
080495ac A _edata
080495b0 A _end
08048468 T _fini
08048484 R _fp_hw
08048274 T _init
080482d0 T _start
080482f4 t call_gmon_start
080495ac b completed.5758
080495a0 W data_start
08048350 t frame_dummy
0804839c T main
08048374 T myFunc
08048388 T myFunc2
080495a8 d p.5756
   U puts@@GLIBC_2.0
Let's do a simple strip:
$ strip example
$ ls -gG example
-rwxr-xr-x 1 3032 2007-04-10 07:53 example
That's less than half the original size (6738 bytes down to 3032). Of course, with a file this small the symbols make up an unusually large share of it, but stripping your binaries normally gives you a good decrease in size.
Mind you, you just lost all kinds of human-readable information in the file which could have been used for debugging, or to let your users give you a useful crash report. However, you can still safely distribute the stripped file to your users as long as you remember to keep either a generated map file of the exact same build or a non-stripped copy saved before stripping. That way back traces from your users will still be easy to look up using your non-stripped build or the map file. (I won't go into more detail about this here, though.)
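To give a rough idea of what such a lookup amounts to, here is a small sketch in C. It takes a handful of text symbols copied from the nm listing above and finds the symbol whose address is the closest one at or below a given address. The names struct symbol and lookup_symbol are invented for this example, and the table is deliberately incomplete; a real tool would read the full symbol table from the non-stripped binary or the map file.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one address/name pair as printed by nm. */
struct symbol {
    uint32_t addr;
    const char *name;
};

/* A few text symbols from the nm listing, sorted by address.
 * This is only an excerpt, so addresses outside this range will
 * not resolve correctly. */
static const struct symbol symbols[] = {
    { 0x08048374u, "myFunc" },
    { 0x08048388u, "myFunc2" },
    { 0x0804839cu, "main" },
    { 0x080483c0u, "__libc_csu_fini" },
};

/* Returns the name of the last symbol starting at or before addr,
 * or NULL if addr lies below every symbol in the table. */
const char *lookup_symbol(uint32_t addr) {
    const char *best = NULL;
    size_t i;

    for (i = 0; i < sizeof symbols / sizeof symbols[0]; i++) {
        if (symbols[i].addr > addr)
            break;              /* table is sorted, so we can stop here */
        best = symbols[i].name;
    }
    return best;
}
```

A crash address of 0x0804838f reported from the stripped binary would, with this table, resolve to myFunc2.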
So, we're still not done stripping. The ELF binary format is organized in sections, the most important being .text and .data sections. You can list the sections of an ELF file and other info using the readelf utility.
$ readelf -S example
There are 27 section headers, starting at offset 0x7a0:
Section Headers:
[Nr] Name      Type     Addr Off Size ES Flg Lk Inf Al
[ 0]        NULL     00000000 000000 000000 00  0 0 0
[ 1] .interp    PROGBITS   08048114 000114 000013 00 A 0 0 1
[ 2] .note.ABI-tag NOTE     08048128 000128 000020 00 A 0 0 4
[ 3] .hash     HASH     08048148 000148 000028 04 A 5 0 4
[ 4] .gnu.hash   GNU_HASH   08048170 000170 000020 04 A 5 0 4
[ 5] .dynsym    DYNSYM    08048190 000190 000050 10 A 6 1 4
[ 6] .dynstr    STRTAB    080481e0 0001e0 00004a 00 A 0 0 1
[ 7] .gnu.version  VERSYM    0804822a 00022a 00000a 02 A 5 0 2
[ 8] .gnu.version_r VERNEED   08048234 000234 000020 00 A 6 1 4
[ 9] .rel.dyn    REL     08048254 000254 000008 08 A 5 0 4
[10] .rel.plt    REL     0804825c 00025c 000018 08 A 5 12 4
[11] .init     PROGBITS   08048274 000274 000017 00 AX 0 0 4
[12] .plt      PROGBITS   0804828c 00028c 000040 04 AX 0 0 4
[13] .text     PROGBITS   080482d0 0002d0 000198 00 AX 0 0 16
[14] .fini     PROGBITS   08048468 000468 00001c 00 AX 0 0 4
[15] .rodata    PROGBITS   08048484 000484 000017 00 A 0 0 4
[16] .eh_frame   PROGBITS   0804849c 00049c 000004 00 A 0 0 4
[17] .ctors     PROGBITS   080494a0 0004a0 000008 00 WA 0 0 4
[18] .dtors     PROGBITS   080494a8 0004a8 000008 00 WA 0 0 4
[19] .jcr      PROGBITS   080494b0 0004b0 000004 00 WA 0 0 4
[20] .dynamic    DYNAMIC   080494b4 0004b4 0000d0 08 WA 6 0 4
[21] .got      PROGBITS   08049584 000584 000004 04 WA 0 0 4
[22] .got.plt    PROGBITS   08049588 000588 000018 04 WA 0 0 4
[23] .data     PROGBITS   080495a0 0005a0 00000c 00 WA 0 0 4
[24] .bss      NOBITS    080495ac 0005ac 000004 00 WA 0 0 4
[25] .comment    PROGBITS   00000000 0005ac 000126 00  0 0 1
[26] .shstrtab   STRTAB    00000000 0006d2 0000cb 00  0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
The strip command allows us to strip sections too, and a section that is not necessary (unless you are using it specifically) is the .comment section. So let's strip that off:
$ strip -R.comment example
$ ls -gG example
-rwxr-xr-x 1 2688 2007-04-10 08:03 example
So, we got rid of another 344 bytes. Nothing to brag about really, but why not.
Removing dead code
Unreachable code, that is, code that can never be executed, is normally removed by compilers such as GCC. Some examples are code inside conditional branches whose condition can never be true, and code after a return statement in a function.
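Both kinds of unreachable code can be seen in this small sketch. The function answer is a made-up example; with optimization enabled, a compiler like GCC is free to leave both puts calls out of the generated code, since neither can ever run.

```c
#include <stdio.h>

/* Hypothetical example function showing two kinds of unreachable code. */
int answer(void) {
    if (0) {
        /* The condition is constant false, so this branch never runs. */
        puts("never printed");
    }

    return 42;

    /* Anything after the return statement is unreachable as well. */
    puts("also never printed");
}
```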
Unfortunately GCC, or more correctly its linker, isn't as clever when it comes to removing unused functions and variable definitions in C. If your project is modularized, with code compiled to object files that are linked later, it doesn't remove unused definitions and their corresponding code. It does, however, seem to remove object and method definitions in C++ if they are defined in one file and compiled in one go. If your project is big and you are compiling for a target that only uses parts of its modules, the following method might be fruitful.
So, how can we make the linker remove code that is not being used in the cases above? Well, first off, the reason the linker can't do this by itself is that it cannot read C/C++. It doesn't parse code; it just links compiled code together into a final binary. Fortunately, there is a way to help the linker find out what is used and what is not, namely by putting function and data definitions in their own sections. The linker knows about sections and can easily discard unused ones if told to.
To tell the compiler to put each function and data definition in its own section, we use GCC's little-known -ffunction-sections and -fdata-sections flags. Then we tell the linker to garbage collect unused sections with --gc-sections.
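Roughly speaking, -ffunction-sections does automatically what the following sketch does by hand with GCC's section attribute: each function ends up in a section named .text.<function name>. The OWN_SECTION macro and the two function names are hypothetical and only serve as illustration; the macro expands to nothing on non-GCC or non-ELF targets so the code still compiles there.

```c
/* Hypothetical macro: place a function in a named section when
 * building with a GCC-compatible compiler for an ELF target. */
#if defined(__GNUC__) && defined(__ELF__)
#define OWN_SECTION(name) __attribute__((section(name)))
#else
#define OWN_SECTION(name)
#endif

/* Ends up in its own section, .text.used_func, instead of plain .text. */
OWN_SECTION(".text.used_func")
int used_func(void) {
    return 1;
}

/* Also gets its own section. If nothing references it, a linker run
 * with --gc-sections may discard .text.unused_func as a whole. */
OWN_SECTION(".text.unused_func")
int unused_func(void) {
    return 2;
}
```

With the flags you don't need any of this in your source; the sketch is only meant to show why the linker can suddenly tell the two functions apart.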
We start off with the same code as earlier. Check the nm output listed above and notice that both the myFunc and myFunc2 symbols are in there, even though only myFunc is actually used in the program. Now let's try compiling the same program with the tricks mentioned above:
$ gcc -Wl,--gc-sections -ffunction-sections -fdata-sections -o example example.c
$ nm example | grep myFunc
08048333 T myFunc
As you can see, there is no myFunc2 symbol, and the binary is 300 bytes smaller before stripping.
Here is an example with a small C++ "project".
objs.cpp:
#include <iostream>
#include "objs.h"
void obj::myFunc(void) {
   std::cout << "myFunc" << '\n';
}
void obj2::myFunc2(void) {
   std::cout << "myFunc2" << '\n';
}
objs.h:
class obj {
   public:
   void myFunc(void);
};
class obj2 {
   public:
   void myFunc2(void);
};
example.cpp:
#include "objs.h"
int main(void) {
   obj* myObj = new obj;
   myObj->myFunc();
   return 0;
}
We compile our little project in a modular fashion and link things together:
$ g++ -c -o objs.o objs.cpp
$ g++ -c -o example.o example.cpp
$ g++ -o example_cpp example.o objs.o
$ ls -gG example_cpp
-rwxr-xr-x 1 8711 2007-04-10 09:36 example_cpp
As you can see from the code, only one of the two classes is actually used. Yet the unused one is present in the binary as well:
$ nm example_cpp | grep myFunc
080486d6 t _GLOBAL__I__ZN3obj6myFuncEv
0804872e T _ZN3obj6myFuncEv
08048702 T _ZN4obj27myFunc2Ev
This may look like gibberish, but if you look closely you can see the obj2::myFunc2 function in there, with C++ name mangling applied. (Symbol names are mangled to avoid clashes, for instance between two functions with the same name but different parameter types.)
We don't want obj2 in our binary at all, so we apply the right magic:
$ g++ -ffunction-sections -fdata-sections -c -o objs.o objs.cpp
$ g++ -ffunction-sections -fdata-sections -c -o example.o example.cpp
$ g++ -Wl,--gc-sections -o example_cpp example.o objs.o
And voilà, no obj2::myFunc2 and about 340 bytes off:
$ nm example_cpp | grep myFunc
08048694 t _GLOBAL__I__ZN3obj6myFuncEv
080486c0 T _ZN3obj6myFuncEv
$ ls -gG example_cpp
-rwxr-xr-x 1 8369 2007-04-10 09:38 example_cpp
This method is not always a win. If you have a small project with most of your code in a few files, and/or you don't compile object files that are linked later, chances are that GCC is already able to remove the code properly, or that the added overhead of giving every function its own section is larger than the size of the removed code.
You might also benefit from applying -ffunction-sections/-fdata-sections selectively, for instance only when compiling files known to contain lots of unused code.
Feel free to comment on the article, correct me if I'm wrong, or suggest additions.
Further reading
• Managing Code Size
• Library (computer science)
• 3.10 Options That Control Optimization
• Optimizing for Space: Measurements and Possibilities for Improvement
Posted: 2007-May-21 12:16:45
Note that --gc-sections doesn't work with the targets ppc-amigaos, ppc-morphos and m68k-amigaos for now. The reason is either that --emit-relocs is used or that the target binary is not an ELF file. Thanks to Jörg for pointing that out.