Convert string to array of bytes in C

Discussion in 'C' started by Rip Cord, Dec 18, 2014.

  1. Rip Cord

    Administrator Staff Member Admin Developer

    As everyone knows, a console program receives its command-line arguments as an array of strings.

    C:\>program.exe 123456

    To use 123456 as a numeric data type, library functions are commonly used to convert the string:
    int i;
    i = atoi(argv[1]);
    or
    unsigned long i;
    i = strtoul(argv[1], NULL, 10);
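
A minimal sketch of the strtoul route with error checking, since atoi gives no way to tell a failed conversion from a genuine 0. The helper name parse_ulong and its return convention are assumptions made for illustration, not part of the original post.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a base-10 unsigned long from text.
   Returns 1 on success and stores the value in *out.
   Returns 0 if the text does not start with a digit (empty string,
   sign, leading space), has trailing characters, or overflows. */
int parse_ulong(const char *text, unsigned long *out)
{
    char *end;
    unsigned long value;

    /* strtoul itself skips spaces and accepts a sign, so filter first */
    if (text[0] < '0' || text[0] > '9') return 0;

    errno = 0;
    value = strtoul(text, &end, 10);
    if (errno == ERANGE) return 0;   /* value does not fit in unsigned long */
    if (*end != '\0') return 0;      /* something other than digits followed */

    *out = value;
    return 1;
}
```

A caller would pass argv[1] and a pointer to an unsigned long, and only use the result when the function returns 1.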

    I couldn't find a standard C library function that converts a string of hex digits to a byte array. Searching the internet for a method mainly turns up the silly answer that C already stores strings as byte arrays, so there is no need to convert them.
    Here is a simple way to convert a fixed-length string to a byte array. I chose a 32-character hex string to use as a 16-byte key.
    Code (C):

    // convert_to_byte.c : Defines the entry point for the console application.
    //
    //converts a string which is 32 characters long to a byte array
     
    #pragma warning(disable : 4996)
    #include "stdafx.h"
    #include <stdio.h>   //printf
    #include <string.h>  //strlen, memcpy
    #include <stdlib.h>  //strtoul
     
     
    int main(int argc, char *argv[])
    {
       
        unsigned char input_string[33]; //a string variable for holding the input string
        unsigned char little_strings[16][3];    //split input string into 16 little strings
        int i;  // i and j are counters for the loops
        int j;
        unsigned __int8 byte_array[16]; //for the array of bytes, 32 hex digits make 16 bytes
     
     
     
        printf("\nconvert_to_byte.exe version 0.1.1\n");
     
        //check number of command line arguments
        if(argc !=2) {
            printf("\n\nusage: %s input_string", argv[0]);
            printf("\ninput_string is a string of 32 hex digits...\n\n");
            return 1;
        }
     
        //check if length of input string is 32 characters
        if((strlen(argv[1])) != 32) {
            printf("\n\nthe input string must be 32 hex digits...\n\n");
            return 1;
        }
     
     
        //print argv[1] as string and as characters
        printf("\nargv[1] as\n  string   %s\n  characters ", argv[1]);
        for(i=0;i<32;i++) printf("%c ", argv[1][i]);
     
     
        //copy string from argv[1] to a string 32 characters plus null to terminate the string
        //not necessary, but provides a level of abstraction from the command line
        memcpy(input_string, argv[1], 33);
     
        printf("\n\ninput_string as\n  string    %s\n  characters ", input_string);
        for(i=0;i<32;i++) printf("%c ", input_string[i]);
     
        //copy 2 characters at a time from the input string into 16 little strings
        printf("\n\ncopying 32 characters, 2 at a time, from the string into 16 little strings...");
        for(j=0,i=0; j<16; j++,i+=2 ) {
            little_strings[j][0] = input_string[i]; //2 characters to make 2 digits of byte
            little_strings[j][1] = input_string[i+1];
            little_strings[j][2] = '\0';    //null character terminates a string
        }
     
        //convert array of strings to array of byte values
        //by using the library function strtoul to convert a string to unsigned long
        //and using a type cast to convert unsigned long to byte, an unsigned 8 bit int
        printf("\n\nconverting each little string to a byte...");
        //strtoul expects a const char *, so the unsigned char buffer is cast
        for(j=0;j<16;j++) byte_array[j] = (unsigned __int8)(strtoul((const char *)&little_strings[j][0],NULL,16));
     
        //print little strings as strings
        printf("\n\nindex        ");
        for(j=0;j<16;j++) printf(" %2d",j);
        printf("\nlittle_strings");
        for(j=0;j<16;j++) printf(" %s", &little_strings[j][0]);
     
        //print byte array
        printf("\nbyte_array    ");
        for(i=0;i<16;i++) printf(" %.2X", byte_array[i]);
     
        printf("\n\nthe little strings were printed using %%s");
        printf("\nthe bytes were printed using %%X");
     
       
        //for the address of the string
        //which was written: &little_strings[j][0]
        //can also use the short form:  little_strings[j]
     
        printf("\n\n\nfinished...\n\n");
        return 0;
    }
     
    output:
    Code (Text):

    C:\>convert_to_byte 0123456789ABCDEF0123456789ABCDEF
     
    convert_to_byte.exe version 0.1.1
     
    argv[1] as
      string     0123456789ABCDEF0123456789ABCDEF
      characters 0 1 2 3 4 5 6 7 8 9 A B C D E F 0 1 2 3 4 5 6 7 8 9 A B C D E F
     
    input_string as
      string     0123456789ABCDEF0123456789ABCDEF
      characters 0 1 2 3 4 5 6 7 8 9 A B C D E F 0 1 2 3 4 5 6 7 8 9 A B C D E F
     
    copying 32 characters, 2 at a time, from the string into 16 little strings...
     
    converting each little string to a byte...
     
    index           0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
    little_strings 01 23 45 67 89 AB CD EF 01 23 45 67 89 AB CD EF
    byte_array     01 23 45 67 89 AB CD EF 01 23 45 67 89 AB CD EF
     
    the little strings were printed using %s
    the bytes were printed using %X
     
    finished...
     
     
    Last edited by a moderator: Jul 24, 2015
  2. Rip Cord

    Administrator Staff Member Admin Developer

    It's easy to show why the answer that a string is already stored as a byte array is silly. With a couple of lines of code, display the address of input_string right after argv[1] is copied to it, then pause the program.

    Code (Text):

    memcpy(input_string, argv[1], 33);
    printf("\naddress of input_string is %p", input_string);
    printf("\npress enter to continue");
    getchar();
     
    console output:
    Code (Text):

    C:\>convert_to_byte 0123456789ABCDEF0123456789ABCDEF

    address of input_string is 0013FF50
    press enter to continue
     
    While the program is paused, use a hex editor to open RAM and look at the memory location pointed to by input_string.

    Code (Text):

    Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
    0013FF50  30 31 32 33 34 35 36 37 38 39 41 42 43 44 45 46  0123456789ABCDEF
    0013FF60  30 31 32 33 34 35 36 37 38 39 41 42 43 44 45 46  0123456789ABCDEF
     
    Or demonstrate the same thing with these lines of code:
    Code (Text):

        printf("\n\nHere is how c natively stores the input string as hex bytes:\n");
        for(i=0;i<32;i++) printf("%.2X", input_string[i]);
     

    console output:
    Code (Text):

    Here is how c natively stores the input string as hex bytes:
    3031323334353637383941424344454630313233343536373839414243444546
     
    Of course, when a person enters 012345..., they want to use the values 01 23 45... in calculations, not the character codes 30 31 32 33 34 35... For the entered digits to become the actual byte values, the string has to be converted to a byte array, even in C.
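
The difference between what is stored and what is meant can be put into two tiny functions. This is only a sketch: the names stored_code and digit_value are made up for illustration, and the 0x30 to 0x39 codes assume an ASCII-compatible character set, as in the hex dump above.

```c
/* Hypothetical helper: the byte value C stores for a character.
   With ASCII, stored_code('1') is 0x31, not 1. */
int stored_code(char c)
{
    return (unsigned char)c;
}

/* Hypothetical helper: the number a decimal digit character stands for,
   or -1 if c is not '0'..'9'. The C standard guarantees that the digit
   characters are contiguous, so subtracting '0' gives the value. */
int digit_value(char c)
{
    if (c < '0' || c > '9') return -1;
    return c - '0';
}
```

For the character '1', the first function gives the code seen in memory and the second gives the number the user had in mind.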
     
  3. sebastiencs

    New Member

    Hello,

    There is no need to use a little_strings array or strtoul.
    There is a simpler method:

    Code (C):

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
     
    /*
    **  Convert ASCII to number
    **  'A'  = 65, 'B' = 66, [...], 'F' = 70
    **  '0'  = 48, '1' = 49, [...], '9' = 57
    */

    uint8_t      get_num(char c) {
      if (c >= 'a' && c <= 'f')   /* also accept lowercase hex digits */
        return (c - 'a' + 10);
      return ((c >= 'A' && c <= 'F') ? (c - 'A' + 10) : (c - '0'));
    }
     
    /*
    **  Convert 2 numbers to byte with 2 half byte
    **
    **  binary:  1001 1010
    **  hexa:   =   9    A
    */

    uint8_t      to_byte(char c1, char c2) {
      return (get_num(c1) << 4 | get_num(c2));
    }
     
    int          main(int argc, char *argv[]) {
      size_t        i, j;
      uint8_t      byte[16];
     
      if (argc > 1 && strlen(argv[1]) == 32)
      {
        for (i = 0, j = 0; j < sizeof(byte); i += 2, j += 1)
        {
          byte[j] = to_byte(argv[1][i], argv[1][i + 1]);
        }
        for (i = 0; i < sizeof(byte); i += 1)
        {
          printf((i + 1 != sizeof(byte)) ? ("%02X|") : ("%02X\n"), byte[i]);
        }
      }
      return (0);
    }
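
The program above trusts that every character is a hex digit. A hedged sketch of the same shift-and-or idea with input checking follows. The names nibble and hex_decode, and the choice to return -1 on error, are assumptions for illustration and not part of the posted code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value of one hex digit (upper or lower case), or -1 if c is not one. */
static int nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical helper: decode hex text into out, which holds out_size bytes.
   Returns the number of bytes written, or -1 if the text has an odd length,
   does not fit in out, or contains a character that is not a hex digit.
   On a -1 return, out may have been partially written. */
int hex_decode(const char *hex, uint8_t *out, size_t out_size)
{
    size_t length = strlen(hex);
    size_t i;

    if (length % 2 != 0 || length / 2 > out_size) return -1;

    for (i = 0; i < length / 2; i++) {
        int hi = nibble(hex[2 * i]);      /* high half byte */
        int lo = nibble(hex[2 * i + 1]);  /* low half byte */
        if (hi < 0 || lo < 0) return -1;
        out[i] = (uint8_t)((hi << 4) | lo);
    }
    return (int)(length / 2);
}
```

With a uint8_t key[16] buffer, hex_decode(argv[1], key, sizeof(key)) would replace both the length check and the conversion loop in main, and a return value other than 16 would mean the key was rejected.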
     
     
    Last edited: Jun 26, 2015
  4. Rip Cord

    Administrator Staff Member Admin Developer

    thanks. that's very nice.
     
  5. sebastiencs

    New Member

    You're welcome.
    I added a few comments to make it more understandable.
     
  6. storm shadow

    Techbliss Owner Admin Ida Pro Expert Developer

  7. Rip Cord

    Administrator Staff Member Admin Developer

    Minor update: this works for any size array of bytes.
    Though sebastiencs's method is the more correct way, this still uses strtoul to do the "heavy lifting" o_O
    Code (C):

    #include <ctype.h>   //isxdigit
    #include <stdint.h>  //uint8_t
    #include <stdio.h>   //printf
    #include <stdlib.h>  //malloc, strtoul
    #include <string.h>  //strlen, memcpy
     
    //note the inverted sense: returns 0 if every character is a hex digit, 1 otherwise
    int ishex(const char* input)
    {
        size_t i;
        size_t length = strlen(input);
     
        for(i=0; i<length; i++) {
            //isxdigit needs a value in the unsigned char range
            if(!isxdigit((unsigned char)input[i])) return 1;
        }
        return 0;
    }
     
    //returns a malloc'd array of strlen(a_string)/2 bytes, or NULL on error
    //the caller must free() the result
    uint8_t *to_bytes(const char* a_string)
    {
        uint8_t *bytes;
        size_t length;
        size_t size;
        char byte_size_string[3] = {0, 0, 0};   //2 hex digits plus terminating null
        size_t i;
     
        length = strlen(a_string);
     
        printf("\n\nhex string:  %s", a_string);
     
        if(length < 2) { printf("\n\nwarning, less than 2 characters not supported\n\n"); return NULL; }
        if(length % 2) { printf("\n\nwarning, %u characters is not an even number\n\n", (unsigned)length); return NULL; }
        if(ishex(a_string)) { printf("\n\nwarning, string is not all hex characters\n\n"); return NULL; }
     
        size = length / 2;
        bytes = (uint8_t *)malloc(size * sizeof(uint8_t));
        if (bytes == NULL) {
            printf("\n\nerror, failed to allocate 0x%X [%u] bytes memory\n\n", (unsigned)size, (unsigned)size);
            return NULL;
        }
     
        //convert string to byte array, 2 hex digits at a time
        for(i=0;i<size;i++) {
            memcpy(byte_size_string, &a_string[2*i], 2);
            bytes[i] = (uint8_t)strtoul(byte_size_string, NULL, 0x10);
        }
        printf("\nhex bytes:   "); for(i=0;i<size;i++) printf("%.2X", bytes[i]);
     
        printf("\n\nstring length:  %2u", (unsigned)length);
        printf("\narray length:   %2u", (unsigned)size);
     
        return bytes;
    }
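
One thing a caller of to_bytes has to work out separately is how long the returned array is. A possible variation, sketched here under the assumption that an extra out-parameter is acceptable, reports the length alongside the pointer. The name to_bytes_sized is hypothetical, and the sketch leaves out the printf diagnostics.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of to_bytes: same strtoul approach, but the array
   length is stored in *out_size. Returns NULL on bad input (fewer than
   2 characters, odd length, non-hex character) or if malloc fails.
   The caller owns the returned memory and must free() it. */
uint8_t *to_bytes_sized(const char *hex, size_t *out_size)
{
    size_t length = strlen(hex);
    size_t i;
    char pair[3] = {0, 0, 0};   /* 2 hex digits plus terminating null */
    uint8_t *bytes;

    if (length < 2 || length % 2 != 0) return NULL;
    for (i = 0; i < length; i++) {
        if (!isxdigit((unsigned char)hex[i])) return NULL;
    }

    bytes = (uint8_t *)malloc(length / 2);
    if (bytes == NULL) return NULL;

    for (i = 0; i < length / 2; i++) {
        memcpy(pair, &hex[2 * i], 2);
        bytes[i] = (uint8_t)strtoul(pair, NULL, 16);
    }
    *out_size = length / 2;
    return bytes;
}
```

A caller would declare a size_t, pass its address, loop over the bytes up to that size, and call free on the pointer when done.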
     

    Last edited: Apr 17, 2019