I think this post raises a great question about choosing appropriate data types.
You might think it would be easy to choose a data type for a given variable, but there are a lot of things that can influence this decision.
Let me start off by saying that, under the right circumstances, you can use any of these data types for gender. But each choice has certain problems.
(Do note: I may use extreme cases to illustrate certain points.)
Let us consider each case one by one.
1. Boolean
Since this classification of gender has only two categories, male and female, we can store it as a boolean variable b
and say that true in b means male and false means female (or vice versa). You can think of b as the answer to the statement
"You are a male."
The advantage is that it is very storage efficient: in principle a boolean needs only one bit (true is usually stored as 1 and false as 0), although many languages actually store a single boolean in a full byte.
The problem is that it is non-intuitive (it raises the question of why true should refer to male rather than female),
and it is also non-expandable: you can have only two different values, so you can't store a new value of gender like "other".
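To make the boolean case concrete, here is a minimal sketch in C++. The names PersonBool, is_male, gender_label and demo are made up for illustration. Notice that the meaning of true lives only in a comment, and there is no third value left for "other".

```cpp
#include <cassert>
#include <string>

// Hypothetical record type; the names are illustrative only.
struct PersonBool {
    std::string name;
    bool is_male;  // true = male, false = female (a convention the reader must be told)
};

// Turns the stored flag back into readable text.
std::string gender_label(const PersonBool& p) {
    return p.is_male ? "Male" : "Female";
}

// Small usage check; call demo() from main() to run it.
void demo() {
    PersonBool p{"Alex", false};
    assert(gender_label(p) == "Female");
}
```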
2. Integer
Similarly to boolean, you can say that one integer (0, for example) means male and another (1, for example) means female.
This has problems similar to those of boolean. It is non-intuitive which number should be male and which should be female. And although it is expandable (you can include more genders by using the other numbers 2, 3, 4, etc.), it usually takes up more space than a character data type,
and its great expandability isn't that useful (it would be unreasonable to expect to need 100 different genders).
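The integer case could look roughly like the sketch below. The constants kMale, kFemale and kOther and the function gender_from_code are hypothetical names. The code expands easily, but a bare 0 or 1 in stored data says nothing by itself.

```cpp
#include <cassert>
#include <string>

// Hypothetical integer codes; which number means what is purely a convention.
constexpr int kMale = 0;
constexpr int kFemale = 1;
constexpr int kOther = 2;  // expandable: just pick the next unused number

// Maps a stored code back to readable text.
std::string gender_from_code(int code) {
    switch (code) {
        case kMale:   return "Male";
        case kFemale: return "Female";
        case kOther:  return "Other";
        default:      return "Unknown";  // an int can also hold codes nobody defined
    }
}

// Small usage check; call demo() from main() to run it.
void demo() {
    assert(gender_from_code(kFemale) == "Female");
}
```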
3. Character
I feel this is the most interesting one, because it is the only one whose stored value is intuitive (you can expect a reasonable person to figure out that a stored value 'M' represents male and 'F' represents female).
Furthermore, since a character can hold more than two different values, it is also expandable. One disadvantage is that it can take more space than a boolean (8 bits versus, in principle, 1). But we choose it anyway because that intuition is important. Now you could say that for the greatest intuition you could store the value directly as a string
like "Male" or "Female", but that doesn't really increase the intuitiveness or simplicity of the code that much, and it greatly decreases the storage efficiency.
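One possible way to get the readable 'M' / 'F' storage while still letting the compiler check the values is a scoped enum whose underlying type is char. This is only a sketch: the names Gender, stored_char and to_text are my own assumptions, and 'O' for "Other" is shown as one possible extension.

```cpp
#include <cassert>
#include <string>

// Hypothetical type: each enumerator's value is the character that gets stored.
enum class Gender : char { Male = 'M', Female = 'F', Other = 'O' };

// With char as the underlying type, the enum occupies exactly one byte.
static_assert(sizeof(Gender) == 1, "Gender is stored in a single char");

// The raw character that would be written to a file or a database column.
constexpr char stored_char(Gender g) { return static_cast<char>(g); }

// Readable text for display.
std::string to_text(Gender g) {
    switch (g) {
        case Gender::Male:   return "Male";
        case Gender::Female: return "Female";
        case Gender::Other:  return "Other";
    }
    return "Unknown";  // only reached if an invalid value was cast into Gender
}

// Small usage check; call demo() from main() to run it.
void demo() {
    assert(stored_char(Gender::Male) == 'M');
}
```

Adding another category later means adding one enumerator and one case, and anyone inspecting the stored data still sees a letter rather than a bare number.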
Good code would be used for years to come, and it would most likely need a few modifications here and there (like maybe the addition of an 'Other' option as 'O'). Often the person modifying the code won't be the one who wrote it, so for them it will be hard to figure out the parts that are non-intuitive or unreadable. Good code is (relatively) easy to read and comprehend, and hence modifying it isn't much of a hassle. Code that is very efficient but very hard to read won't be useful for long.
Usually the priorities of a programmer (in expressing computation) are as follows:
1. Correctness (of output and program)
2. Simplicity
3. Efficiency
While coding, even though there are lots of possibilities, we stick to certain conventions to avoid confusion and misunderstandings.
Another factor is storage, as I mentioned. The difference in storage used is not apparent to us, and oftentimes it makes no difference in
our normal programs. But real-world programs often have to deal with hundreds of thousands of records or variables like these, so storage efficiency can make a great difference there. (For example, in C++, the programming language I'm most familiar with, a char is exactly 1 byte, which is 8 bits on common platforms, and an int is typically 4 bytes or 32 bits. A bool logically needs only 1 bit, but sizeof(bool) is implementation-defined and is 1 byte on most compilers, so a bool saves space over a char only when several are packed together, for example in a bit field or std::vector<bool>.)
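The storage arithmetic can be checked with sizeof. The sketch below uses two hypothetical helpers, gender_storage_bytes and packed_bits_bytes, to estimate how many bytes the gender field alone would need across many records. The exact numbers depend on the platform.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Guarantees from the C++ standard; everything else is platform-dependent.
static_assert(sizeof(char) == 1, "char is one byte by definition");
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
static_assert(sizeof(bool) >= 1, "a bool object occupies at least one whole byte");

// Bytes needed for the gender field alone across `count` records of type T.
template <typename T>
constexpr std::size_t gender_storage_bytes(std::size_t count) {
    return count * sizeof(T);
}

// Bytes needed if each two-valued gender is packed into a single bit instead.
constexpr std::size_t packed_bits_bytes(std::size_t count) {
    return (count + CHAR_BIT - 1) / CHAR_BIT;  // round up to whole bytes
}

// Small usage check; call demo() from main() to run it.
void demo() {
    assert(gender_storage_bytes<char>(100000) == 100000);
}
```

On a platform where int is 4 bytes and a byte is 8 bits, 100,000 records would need 400,000 bytes as int, 100,000 bytes as char, and 12,500 bytes when packed one bit each.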
Note: I don't have much training or practice in real-world programming. These are my ideals and thoughts, guided by the programming principles and ideals in "Programming: Principles and Practice Using C++" (second edition) by Bjarne Stroustrup, the creator of C++.
Any advice or corrections regarding my thoughts and ideas are gladly welcomed.
Thank You,
-Dev Raj R