OSPF is something that still mystifies me. I know it is a vast protocol, and I hope I can demystify one part of it for you today.
“Why do other areas need to connect to area 0?”
Picture a tree. It has a nice, big, tall trunk and many branches. Area 0 is the trunk of your network, whilst the other areas are, in essence, branches.
As we know, every router in an area shares information about itself and its links with all the other routers in that area. In turn, each router builds a link-state database. Each router in the area then runs SPF against that database and the “tree” is formed.
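To make the SPF step concrete, here is a minimal sketch of Dijkstra's algorithm run over a toy link-state database. The router names are borrowed from this post, but the links and costs are made up for illustration, and a real OSPF implementation works on LSAs rather than a plain dictionary.

```python
import heapq

# Hypothetical link-state database for one area: router -> {neighbor: cost}.
# The links and costs are invented for illustration only.
LSDB = {
    "R1": {"R2": 10, "R3": 1},
    "R2": {"R1": 10, "R3": 1},
    "R3": {"R1": 1, "R2": 1},
}

def spf(lsdb, root):
    """Run Dijkstra from root; return {router: (total_cost, parent)}.

    The root is recorded as its own parent.
    """
    tree = {}
    queue = [(0, root, root)]  # (cost so far, router, router we came from)
    while queue:
        cost, node, parent = heapq.heappop(queue)
        if node in tree:
            continue  # already placed on the tree via a cheaper path
        tree[node] = (cost, parent)
        for neighbor, link_cost in lsdb.get(node, {}).items():
            if neighbor not in tree:
                heapq.heappush(queue, (cost + link_cost, neighbor, node))
    return tree

if __name__ == "__main__":
    for router, (cost, parent) in spf(LSDB, "R1").items():
        print(router, "cost", cost, "via", parent)
```

With these made-up costs, R1 reaches R2 through R3 at a total cost of 2 rather than over the direct link at cost 10. Every router in the area holds the same database, but each one runs the calculation with itself as the root.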

When a network grows and its link-state database becomes large, it is important to break the network into areas like the above. This keeps the database under control and ensures efficient convergence. I like to define areas by site or geographic boundary where applicable, or where I want to leverage the distinct LSAs of certain area types. The areas that connect to area 0 are our “branches”. By defining areas we can limit the SPF calculation to the devices in each area.
Each area connects to area 0 through an Area Border Router (ABR). ABRs have an important role: an ABR maintains a separate link-state database for each area it is attached to. It uses type 3 LSAs to tell the routers in one area that it knows how to reach prefixes in its other connected areas. It is important to know that ABRs act as the eyes and ears for routers in other areas. The ABR in the picture (R3) can see the routers R1/R2 in A0 and R4/R5 in A1 and acts as their eyes and ears. R1/R2 and R4/R5 do not know of each other directly.
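A rough sketch of what the ABR does with type 3 LSAs might look like the following. The prefixes, costs and the `type3_summaries` helper are all hypothetical, and the model is deliberately simplified to the two-area picture: the ABR takes the prefixes it has learned inside one area and advertises only prefix and metric into the other.

```python
# Hypothetical intra-area routes as seen by the ABR (R3): area -> {prefix: cost}.
# Prefixes and costs are invented for illustration only.
INTRA_AREA_ROUTES = {
    0: {"10.0.1.0/24": 10, "10.0.2.0/24": 20},
    1: {"10.1.4.0/24": 10, "10.1.5.0/24": 15},
}

def type3_summaries(abr, routes_by_area, into_area):
    """Build the summary (type 3) advertisements the ABR sends into one area."""
    summaries = []
    for area, routes in sorted(routes_by_area.items()):
        if area == into_area:
            continue  # the area already knows its own prefixes in full detail
        for prefix, cost in sorted(routes.items()):
            summaries.append(
                {"type": 3, "adv_router": abr, "prefix": prefix, "metric": cost}
            )
    return summaries

if __name__ == "__main__":
    for lsa in type3_summaries("R3", INTRA_AREA_ROUTES, into_area=1):
        print(lsa)
```

Notice what is missing from each advertisement: there is no mention of R1 or R2, or of the links between them. The routers in area 1 learn only that R3 can reach the prefix at a given metric.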
Inter-area OSPF behaves like a distance vector protocol. Although OSPF is a link-state protocol, a router learns a prefix in another area only as a prefix and a metric advertised by an ABR, not as topology, and this leaves inter-area routing prone to loops. This is why every area must connect back to area 0: to avoid routing loops. Now you can see why network designs that use OSPF all join back to area 0, and why it is important to ensure virtual links are used as band-aids only.
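The distance vector flavour shows up in how a router inside an area picks an inter-area route. As a minimal sketch (the `inter_area_routes` helper, the prefix and the costs are assumptions for illustration), the router simply adds its own SPF cost to the ABR to whatever metric the ABR advertised, trusting that number without seeing the topology behind it.

```python
def inter_area_routes(cost_to_abr, summaries):
    """For each prefix, pick the ABR with the lowest total cost.

    total cost = this router's SPF cost to the ABR + the metric the ABR advertised
    """
    best = {}
    for lsa in summaries:
        abr = lsa["adv_router"]
        if abr not in cost_to_abr:
            continue  # ABR is not reachable inside our own area, so ignore it
        total = cost_to_abr[abr] + lsa["metric"]
        prefix = lsa["prefix"]
        if prefix not in best or total < best[prefix][0]:
            best[prefix] = (total, abr)
    return best

if __name__ == "__main__":
    # Hypothetical view from R4 in area 1: SPF says R3 is 5 away, and R3
    # advertised an area 0 prefix with a metric of 10.
    r4_cost_to_abr = {"R3": 5}
    r4_summaries = [{"adv_router": "R3", "prefix": "10.0.1.0/24", "metric": 10}]
    print(inter_area_routes(r4_cost_to_abr, r4_summaries))
```

This is "routing by rumour" in the same sense as a distance vector protocol: R4 cannot verify the path beyond R3. Forcing all inter-area information through area 0 keeps those rumours from circulating between areas and forming a loop.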
Where an area cannot connect to area 0 directly, either a virtual link (VL) or a tunnel could be used. Remember that a VL is control plane only, while a tunnel is both control and data plane capable.