It boils down to three main aspects we’ll call sensing, planning, and acting. The car perceives its surroundings using computer vision and sensor fusion, works out exactly where it is and plans a path, and finally actuates the steering, throttle, and brakes to follow that path.
To recognize the world around it, the vehicle is equipped with cameras that together give it 360-degree vision. Constantly watching the whole scene, the onboard computer runs neural networks to identify the car next to you, the approaching stop sign, the old man crossing the street, or the kid chasing his soccer ball into the road. Cameras are good at recognizing what things are, but on their own they can’t precisely measure how far away those things are. This is where lidar (light detection and ranging) comes into play: by bouncing laser pulses off the surroundings and timing their return, lidar supplements the camera vision with a very precise 3D view of everything around the vehicle. Now the whole picture has been assembled.
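To make the camera-plus-lidar idea concrete, here is a minimal sketch of one way to fuse the two: the camera’s neural network says *what* an object is and in which direction it lies, and nearby lidar returns supply *how far* it is. All names here (`Detection`, `fuse`, the bearing-matching approach) are hypothetical simplifications; real pipelines calibrate the sensors and project lidar points into the camera image instead.

```python
from dataclasses import dataclass
from statistics import median

@dataclass
class Detection:
    label: str          # what the camera's neural network recognized
    bearing_deg: float  # direction of the object relative to the car's heading

def fuse(detection, lidar_returns, window_deg=2.0):
    """Attach a distance to a camera detection using nearby lidar returns.

    lidar_returns: list of (bearing_deg, range_m) laser measurements.
    We simply collect returns whose bearing is close to the detection's
    bearing and take their median range (robust to stray points).
    """
    ranges = [r for (b, r) in lidar_returns
              if abs(b - detection.bearing_deg) <= window_deg]
    if not ranges:
        return detection.label, None  # camera saw it, lidar did not
    return detection.label, median(ranges)

# A stop sign seen 10 degrees to the right, plus a toy lidar scan.
stop_sign = Detection("stop sign", bearing_deg=10.0)
scan = [(9.2, 14.9), (10.1, 15.1), (10.8, 15.0), (40.0, 3.2)]
print(fuse(stop_sign, scan))  # → ('stop sign', 15.0)
```

The last lidar return (at 40 degrees) belongs to something else entirely and is ignored, which is exactly the filtering a real fusion step has to do, just in far more dimensions.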
Planning & Acting
Once this picture is gathered, the computer consults an onboard GPS receiver that can locate the car, but only to within a meter or two. This is paired with high-definition maps that tighten that fix to single-digit centimeters. The maps contain landmarks (buildings, light posts, etc.); when the camera and lidar recognize those landmarks, the car can triangulate its own position against them very precisely.
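The triangulation step can be sketched in a few lines. Assuming the HD map gives us the 2D positions of three landmarks and the sensors give us the distance to each, solving for the car’s position is classic trilateration: subtracting one circle equation from the other two leaves a small linear system. This is a hypothetical toy (real localizers fuse many noisy measurements over time, e.g. with a Kalman filter), but it shows the geometry.

```python
def trilaterate(landmarks, ranges):
    """Solve for the car's (x, y) from three landmark positions and distances.

    Each landmark gives a circle (x - xi)^2 + (y - yi)^2 = ri^2.
    Subtracting circle 1 from circles 2 and 3 cancels the x^2 + y^2
    terms, leaving a 2x2 linear system we solve by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = landmarks
    r1, r2, r3 = ranges
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det

# Car is actually at (3, 4); ranges are computed from that point.
lms = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
rs = [5.0, ((10 - 3)**2 + 4**2)**0.5, (3**2 + (10 - 4)**2)**0.5]
print(trilaterate(lms, rs))  # → (3.0, 4.0)
```

In practice the GPS fix narrows down *which* mapped landmarks the car should be seeing, and the landmark geometry then refines the meter-level fix down to centimeters.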
Now that the car can see what obstacles are immediately around it, and knows precisely where on earth it is, it can decide how to navigate to its destination. This is path planning: just as your normal GPS maps a route for you to drive, the car does this for itself, except it creates waypoints, plotting a projected path it should follow. Finally, the car works its physical controls to match that projected path as closely as it can.