Why is typeof null === "object"?
The unary typeof operator returns a string that names the type of its operand. For example, typeof 1 returns "number" and typeof "" returns "string". All the values the operator can return are listed in the ECMA-262 specification, section 13.5.3. In principle, each returned value should correspond to one of the data types defined in the same specification. On closer examination, however, typeof null is required to return "object", even though Null is an independent type of its own, described in section 6.1.2. The reason is the ordinary human factor or, put simply, an innocent mistake in the code. In this article, let's try to figure out how that mistake could have happened.
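To make the mismatch concrete, here is a small sketch that runs typeof over one sample value of each language type. The `samples` table and the `isNull` helper are names made up for this illustration; they do not come from the specification.

```javascript
// One sample value per ECMAScript language type, paired with the string
// that typeof is specified to return for it.
const samples = [
  [undefined, "undefined"],
  [null, "object"], // Null is its own type, yet typeof reports "object"
  [true, "boolean"],
  [1, "number"],
  [1n, "bigint"],
  ["", "string"],
  [Symbol("s"), "symbol"],
  [{}, "object"],
  [function () {}, "function"],
];

for (const [value, expected] of samples) {
  console.log(String(value), "->", typeof value, "(expected " + expected + ")");
}

// Because typeof cannot tell null from an object, a reliable null check
// has to compare the value itself.
const isNull = (value) => value === null;
```

Every row except the second one maps a type to its own name (or to "function" for callable objects); only Null borrows the name of another type.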
Mocha
It is worth starting, perhaps, from the very beginning of JavaScript, namely with Mocha, the prototype language that Brendan Eich created in 1995 in just 10 days. It was later renamed LiveScript and, later still, in December 1995, received the name we know today: JavaScript.
Unfortunately, the source code of Mocha has never been published, so we do not know exactly what it looked like back in 1995. However, in the comments to an article on Dr. Axel Rauschmayer's blog, Eich wrote that he used the "discriminated union" technique, also known as a "tagged union", implemented as a struct with two fields.
The structure could have looked like this, for example:
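#include <string>

enum JSType {
  OBJECT,
  FUNCTION,
  NUMBER,
  STRING,
  BOOLEAN,
};

union JSValue {
  // std::string is a non-trivial member, so a real implementation would need
  // a user-provided constructor and destructor (or a plain pointer here).
  std::string value;
  // ... other details
};

struct TypeOf {
  JSType type;     // the tag: says which kind of value is stored
  JSValue values;  // the stored value itself
};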
In the same article, Axel Rauschmayer gives an example of the SpiderMonkey engine code (used in Mozilla Firefox) from 1996:
JS_PUBLIC_API(JSType)
JS_TypeOfValue(JSContext *cx, jsval v)
{
    JSType type = JSTYPE_VOID;
    JSObject *obj;
    JSObjectOps *ops;
    JSClass *clasp;

    CHECK_REQUEST(cx);
    if (JSVAL_IS_VOID(v)) {
        type = JSTYPE_VOID;
    } else if (JSVAL_IS_OBJECT(v)) {
        obj = JSVAL_TO_OBJECT(v);
        if (obj &&
            (ops = obj->map->ops,
             ops == &js_ObjectOps
             ? (clasp = OBJ_GET_CLASS(cx, obj),
                clasp->call || clasp == &js_FunctionClass)
             : ops->call != 0)) {
            type = JSTYPE_FUNCTION;
        } else {
            type = JSTYPE_OBJECT;
        }
    } else if (JSVAL_IS_NUMBER(v)) {
        type = JSTYPE_NUMBER;
    } else if (JSVAL_IS_STRING(v)) {
        type = JSTYPE_STRING;
    } else if (JSVAL_IS_BOOLEAN(v)) {
        type = JSTYPE_BOOLEAN;
    }
    return type;
}
Although the algorithm differs from the original Mocha code, it illustrates the essence of the error well: it simply has no check for the Null type. Instead, when v is null, the algorithm falls into the else if (JSVAL_IS_OBJECT(v)) branch and returns JSTYPE_OBJECT.
Why "object"?
The fact is that in early versions of the language a value was stored as a 32-bit unsigned number (uint32_t) whose three lowest-order bits indicate the type of the value. The scheme used the following tag values:
000: object - the variable is a reference to an object
001: int - the variable contains a 31-bit integer (in practice only the lowest bit marks an integer, which leaves 31 bits for the number)
010: double - the variable is a reference to a floating-point number
100: string - the variable is a reference to a sequence of chars
110: boolean - the variable is a boolean value
Null, in turn, was represented by the machine null pointer, which looks like 0x00000000. Therefore, the check JSVAL_IS_OBJECT(0x00000000) returns true, because the tag bits are 000, which corresponds to the object type.
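The tag check described above can be modelled in a few lines. This is only a simplified sketch: `TAG_MASK`, `makeWord`, `typeOfWord` and `NULL_WORD` are hypothetical names, the addresses are made up, and the model ignores undefined and functions. Treating any word with the lowest bit set as an integer is also an assumption of the model.

```javascript
// Simplified model of the early value word (an assumption for illustration):
// a 32-bit unsigned integer whose lowest bits carry the type tag.
const TAG_MASK = 0b111;

// Hypothetical helper: combines an aligned address with a tag.
function makeWord(address, tag) {
  return ((address & ~TAG_MASK) | tag) >>> 0;
}

function typeOfWord(word) {
  if ((word & 0b1) === 0b1) return "number"; // int: the lowest bit is set
  switch (word & TAG_MASK) {
    case 0b000:
      return "object";
    case 0b010:
      return "number"; // reference to a double
    case 0b100:
      return "string";
    case 0b110:
      return "boolean";
  }
}

// The machine null pointer: every bit is zero, including the tag bits.
const NULL_WORD = 0x00000000;

console.log(typeOfWord(makeWord(0x1000, 0b100))); // "string"
console.log(typeOfWord(NULL_WORD)); // "object"
```

Nothing in `typeOfWord` can distinguish `NULL_WORD` from a word holding an object reference, because the only thing it inspects is the tag, and the tag of an all-zero word is 000.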
Attempts to fix the bug
Later, this behavior was recognized as a bug. In 2006, Eich proposed deprecating the typeof operator and replacing it with a type() function, which would take into account, among other things, Null (an archived copy of the proposal). The function could have been built in or part of an optional reflection package. Either way, such a fix would not have been backward compatible with previous versions of the language and would have created many problems for existing JavaScript code written by developers around the world. It would have required a mechanism for checking the code version and/or custom language options, which did not look realistic.
As a result, the proposal was not accepted, and the typeof operator in the ECMA-262 specification remained in its original form.
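Since the operator itself could not be changed, a null-aware check has to live in user code. Below is a minimal sketch of such a helper. The name `typeOfValue` is hypothetical, and this is not the type() function from the 2006 proposal, whose exact semantics are not reproduced here.

```javascript
// Hypothetical userland helper: reports "null" for null and defers to the
// built-in typeof operator for everything else.
function typeOfValue(value) {
  return value === null ? "null" : typeof value;
}

console.log(typeOfValue(null)); // "null"
console.log(typeOfValue({})); // "object"
console.log(typeof null); // "object" (the built-in behavior is unchanged)
```

The sketch also hints at the compatibility problem: existing code that relies on typeof x === "object" being true for null would silently change meaning if the operator itself started returning "null".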
Even later, in 2017, another proposal was put forward: Builtin.is and Builtin.typeOf. Its main motivation was that the instanceof operator does not guarantee correct type checks for values that come from different realms. The proposal was not directly related to Null, but its text suggested correcting this bug by creating a new Builtin.typeOf() function. This proposal was not accepted either, because the edge case demonstrated in its motivational part, although not very elegant, can be solved by existing methods.
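The text above does not name those existing methods, so the following is only an assumption about what they might be: two standard checks that do not rely on constructor identity and therefore do not depend on the realm a value was created in. The helper names `isArray` and `classOf` are made up for this sketch.

```javascript
// Array.isArray inspects the value itself rather than its prototype chain,
// so it does not care which realm's Array constructor created the value.
const isArray = (value) => Array.isArray(value);

// Object.prototype.toString returns "[object Tag]"; slice out the Tag part.
function classOf(value) {
  return Object.prototype.toString.call(value).slice(8, -1);
}

console.log(isArray([])); // true
console.log(classOf([])); // "Array"
console.log(classOf(null)); // "Null"
console.log(classOf(new Date(0))); // "Date"
```

Note that the result of `classOf` can be overridden through Symbol.toStringTag, which is part of why this approach is workable but not very elegant.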
Modern Null
As I wrote above, the bug appeared in 1995 in the Mocha prototype, before JavaScript itself existed, and as late as 2006 Brendan Eich had not given up hope of fixing it. Since 2017, however, neither the developers nor ECMA have tried to do this anymore. In the meantime, JavaScript has become much more complex, as have its implementations in popular engines.
SpiderMonkey
No trace remains of the SpiderMonkey code that Axel Rauschmayer published on his blog in 2013. Now the engine (FF 121 at the time of writing) takes the typeof value from a predefined value tag:
JSType js::TypeOfValue(const Value& v) {
  switch (v.type()) {
    case ValueType::Double:
    case ValueType::Int32:
      return JSTYPE_NUMBER;
    case ValueType::String:
      return JSTYPE_STRING;
    case ValueType::Null:
      return JSTYPE_OBJECT;
    case ValueType::Undefined:
      return JSTYPE_UNDEFINED;
    case ValueType::Object:
      return TypeOfObject(&v.toObject());
#ifdef ENABLE_RECORD_TUPLE
    case ValueType::ExtendedPrimitive:
      return TypeOfExtendedPrimitive(&v.toExtendedPrimitive());
#endif
    case ValueType::Boolean:
      return JSTYPE_BOOLEAN;
    case ValueType::BigInt:
      return JSTYPE_BIGINT;
    case ValueType::Symbol:
      return JSTYPE_SYMBOL;
    case ValueType::Magic:
    case ValueType::PrivateGCThing:
      break;
  }

  ReportBadValueTypeAndCrash(v);
}
Now the engine knows exactly what type of value is passed to the operator, because every value carries a tag indicating its type. For Null, the operator returns JSTYPE_OBJECT explicitly, as required by the specification:
enum JSValueType : uint8_t {
  JSVAL_TYPE_DOUBLE = 0x00,
  JSVAL_TYPE_INT32 = 0x01,
  JSVAL_TYPE_BOOLEAN = 0x02,
  JSVAL_TYPE_UNDEFINED = 0x03,
  JSVAL_TYPE_NULL = 0x04,
  JSVAL_TYPE_MAGIC = 0x05,
  JSVAL_TYPE_STRING = 0x06,
  JSVAL_TYPE_SYMBOL = 0x07,
  JSVAL_TYPE_PRIVATE_GCTHING = 0x08,
  JSVAL_TYPE_BIGINT = 0x09,
#ifdef ENABLE_RECORD_TUPLE
  JSVAL_TYPE_EXTENDED_PRIMITIVE = 0x0b,
#endif
  JSVAL_TYPE_OBJECT = 0x0c,

  // This type never appears in a Value; it's only an out-of-band value.
  JSVAL_TYPE_UNKNOWN = 0x20
};
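The same idea can be mirrored in JavaScript to show what changed compared with the early engine: Null now has a tag of its own, so returning "object" for it is a deliberate choice rather than an accident of the bit layout. The `ValueType` table, the `{ type, payload }` value shape and `typeOfTagged` are hypothetical names for this sketch, not SpiderMonkey APIs.

```javascript
// Hypothetical model: each value is a { type, payload } pair, and typeof
// becomes a plain lookup over the explicit type tag.
const ValueType = Object.freeze({
  Double: "Double",
  Int32: "Int32",
  Boolean: "Boolean",
  Undefined: "Undefined",
  Null: "Null",
  String: "String",
  Object: "Object",
});

function typeOfTagged(value) {
  switch (value.type) {
    case ValueType.Double:
    case ValueType.Int32:
      return "number";
    case ValueType.String:
      return "string";
    case ValueType.Null:
      return "object"; // Null has its own tag; "object" is chosen on purpose
    case ValueType.Undefined:
      return "undefined";
    case ValueType.Boolean:
      return "boolean";
    case ValueType.Object:
      return typeof value.payload === "function" ? "function" : "object";
    default:
      throw new TypeError("bad value type: " + String(value.type));
  }
}

console.log(typeOfTagged({ type: ValueType.Null, payload: null })); // "object"
console.log(typeOfTagged({ type: ValueType.Int32, payload: 7 })); // "number"
```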
V8
A similar approach is used in the V8 engine (version 12.2.165 at the time of writing). Here, Null is a so-called Oddball: the Null object is initialized before any JS code is executed, and all subsequent references to the null value lead to this single object.
The initializer of the Oddball class looks like this:
void Oddball::Initialize(Isolate* isolate, Handle<Oddball> oddball,
                         const char* to_string, Handle<Object> to_number,
                         const char* type_of, uint8_t kind) {
  STATIC_ASSERT_FIELD_OFFSETS_EQUAL(HeapNumber::kValueOffset,
                                    offsetof(Oddball, to_number_raw_));

  Handle<String> internalized_to_string =
      isolate->factory()->InternalizeUtf8String(to_string);
  Handle<String> internalized_type_of =
      isolate->factory()->InternalizeUtf8String(type_of);
  if (IsHeapNumber(*to_number)) {
    oddball->set_to_number_raw_as_bits(
        Handle<HeapNumber>::cast(to_number)->value_as_bits(kRelaxedLoad));
  } else {
    oddball->set_to_number_raw(Object::Number(*to_number));
  }
  oddball->set_to_number(*to_number);
  oddball->set_to_string(*internalized_to_string);
  oddball->set_type_of(*internalized_type_of);
  oddball->set_kind(kind);
}
In addition to the Isolate, a handle to the oddball itself and its kind (an enum value), the initializer explicitly takes the toString, toNumber and typeof values, which it then stores in the object. This makes it possible to define these Oddball parameters when the global heap is initialized:
// Initialize the null_value.
Oddball::Initialize(isolate(), factory->null_value(), "null",
                    handle(Smi::zero(), isolate()), "object",
                    Oddball::kNull);
Here we see that when Null is initialized, the following values are passed to the class: toString="null", toNumber=0, typeof="object".
The typeof operator itself simply takes the value through the class getter type_of():
// static
Handle<String> Object::TypeOf(Isolate* isolate, Handle<Object> object) {
  if (IsNumber(*object)) return isolate->factory()->number_string();
  if (IsOddball(*object))
    return handle(Oddball::cast(*object)->type_of(), isolate);  // <- typeof null === "object"
  if (IsUndetectable(*object)) {
    return isolate->factory()->undefined_string();
  }
  if (IsString(*object)) return isolate->factory()->string_string();
  if (IsSymbol(*object)) return isolate->factory()->symbol_string();
  if (IsBigInt(*object)) return isolate->factory()->bigint_string();
  if (IsCallable(*object)) return isolate->factory()->function_string();
  return isolate->factory()->object_string();
}
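The oddball idea can be sketched in JavaScript as a singleton that carries its own precomputed answers. The `Oddball` class below, its field names, and the `NULL_VALUE` and `TRUE_VALUE` constants are a hypothetical model for illustration; they are not V8's actual layout or API.

```javascript
// Hypothetical model of an oddball: an immutable singleton that stores the
// results of toString, toNumber and typeof at initialization time.
class Oddball {
  constructor(toStringValue, toNumberValue, typeOfValue, kind) {
    this.toStringValue = toStringValue;
    this.toNumberValue = toNumberValue;
    this.typeOfValue = typeOfValue;
    this.kind = kind;
    Object.freeze(this);
  }
}

// Created once, before any "user code" runs; every null refers to this object.
const NULL_VALUE = new Oddball("null", 0, "object", "kNull");
const TRUE_VALUE = new Oddball("true", 1, "boolean", "kTrue");

function typeOf(value) {
  // For oddballs the answer was fixed at initialization; just read it back.
  if (value instanceof Oddball) return value.typeOfValue;
  // Stand-in for the remaining checks (strings, symbols, callables, ...).
  return typeof value;
}

console.log(typeOf(NULL_VALUE)); // "object"
console.log(typeOf(TRUE_VALUE)); // "boolean"
```

In this model, changing the result for null would mean changing a single string passed at initialization, which shows that the behavior is kept for compatibility with the specification rather than because of any technical constraint in the engine.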
EN - https://t.me/frontend_almanac
RU - https://t.me/frontend_almanac_ru
Russian version: https://blog.frontend-almanac.ru/T6L4f8J6RCa