Translating a .txt

6 comments, last by Byter 3 years, 11 months ago

I want to program a simple translator that can translate from English to German and vice versa. For that I need a file that contains, for example, a German word followed by its possible English translations. It could be structured like this:

german word - english translation 1, english translation 2

laufen - to run, to go, …
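A line in this format could be parsed roughly as follows. This is only a sketch: the names `Entry`, `trim` and `parseLine` are made up for illustration, and it assumes that " - " (space, hyphen, space) separates the word from its translations and that commas separate the translations.

```cpp
#include <string>
#include <vector>

// One dictionary entry: a source word and its possible translations.
struct Entry {
    std::string word;
    std::vector<std::string> translations;
};

// Remove leading and trailing spaces, tabs and carriage returns.
std::string trim(const std::string& s) {
    const std::string::size_type first = s.find_first_not_of(" \t\r");
    if (first == std::string::npos) return "";
    const std::string::size_type last = s.find_last_not_of(" \t\r");
    return s.substr(first, last - first + 1);
}

// Parse a line such as "laufen - to run, to go" into `out`.
// Assumption: " - " separates the word from the translations, so a
// hyphenated word like "E-Mail" is not split by accident.
// Returns false if the separator is missing or the word is empty.
bool parseLine(const std::string& line, Entry& out) {
    const std::string sep = " - ";
    const std::string::size_type pos = line.find(sep);
    if (pos == std::string::npos) return false;

    const std::string word = trim(line.substr(0, pos));
    if (word.empty()) return false;

    out.word = word;
    out.translations.clear();

    // Split the remainder on commas and keep the non-empty pieces.
    const std::string rest = line.substr(pos + sep.size());
    std::string::size_type start = 0;
    while (start <= rest.size()) {
        std::string::size_type comma = rest.find(',', start);
        if (comma == std::string::npos) comma = rest.size();
        const std::string piece = trim(rest.substr(start, comma - start));
        if (!piece.empty()) out.translations.push_back(piece);
        start = comma + 1;
    }
    return true;
}
```

Calling `parseLine` on each line read with `std::getline` would give one `Entry` per word, which could then be stored in a `std::map` or similar for lookup.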

However, I can't seem to find such files. What I did find are .txt files containing alphabetically sorted vocabulary lists, one for German and one for English, and I have Google Translate.

So my idea is to write a program (preferably in C++, because that's the language I usually use) that types a single word from the vocabulary list into Google Translate, reads the translation(s) from the HTML file, saves them, then types in the next word, and so on.

So my question is: how can I do that? How can I open a website with C++ and get its code?


I suspect one of the reasons no such file exists is that translators also need to deal with differences in grammar, as well as with languages unhelpfully having words that are entirely dependent on context (left, right, bat, etc.).

Google provides a translation API as a service, which you can use. https://cloud.google.com/translate/docs
As with many cloud providers, they provide SDKs for various client languages (I believe the C++ one is https://github.com/googleapis/google-cloud-cpp, but I am not sure whether it includes translation services). Alternatively, you can make the HTTP requests yourself with whichever library you like (or with raw sockets, but that is a lot of effort).

Also, a language normally has several words for almost the same thing, but with a different emphasis or a different intention. English “walk” and “run” both imply you're moving on your own feet, but in the latter case you are in much more of a hurry, either because you're late or because you want to move faster. “go” doesn't really need to mean moving on your own feet (a car would do too); the emphasis is more on the act of leaving.

Preserving that emphasis or intention across a translation can be really hard.

Oops, I thought I had already answered a few hours ago…

The translator I want to make is fairly simple. I don’t intend to feed it “complicated” input like two words at once.

It should just give you some possible translations of a word (it doesn't need to be all of them). As @Alberth said, accurate translation is difficult and, for a hobby project, totally out of reach.

My problem at the moment is mostly getting the data for translation. I need words and their translations arranged in a way that a program can easily read from a file (most likely a .txt), for example as I showed in the first post. But I didn’t find such a data set on the internet, so I will have to make one myself.

What I was able to find, however, are vocabulary lists for single languages (English and German; there are more on the internet, but translating English -> German and back is more than enough). For example, for German I found a list with around 200,000 words (without inflections, plurals, etc.) and a list with around 600,000 words (with inflections, plurals, etc.). But I need to translate them in such a way that the translations are in the same order as the original words. Ordering them is not the problem; the translation itself is.

Translating 200,000 words by hand would take way too long, so I thought writing a program that does it would be much easier. The words in the text files are arranged one word per line with nothing else (no spaces or similar), so I can easily read them with a stream reader. Furthermore, I thought that I could somehow use Google Translate to get the translations, as it already has a big “vocabulary” for translation.
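The reading and ordering part could be sketched like this. The names `Translator` and `translateList` are made up for illustration, and the `translate` callback is a placeholder for whatever lookup ends up being used (an API call, a dictionary file, and so on). Because each word is written out together with its translations as soon as it is read, the output stays in the same order as the input list.

```cpp
#include <cstddef>
#include <functional>
#include <istream>
#include <ostream>
#include <sstream>  // not needed by translateList itself; handy for trying it on in-memory streams
#include <string>
#include <vector>

// Hypothetical callback type: takes one word, returns its translations.
using Translator = std::function<std::vector<std::string>(const std::string&)>;

// Read one word per line from `in` and write "word - t1, t2" lines to `out`,
// keeping the original order. Returns the number of words processed.
std::size_t translateList(std::istream& in, std::ostream& out,
                          const Translator& translate) {
    std::size_t count = 0;
    std::string word;
    while (std::getline(in, word)) {
        // Tolerate Windows line endings ("\r\n").
        if (!word.empty() && word.back() == '\r') word.pop_back();
        if (word.empty()) continue;  // skip blank lines

        const std::vector<std::string> results = translate(word);
        out << word << " - ";
        for (std::size_t i = 0; i < results.size(); ++i) {
            if (i > 0) out << ", ";
            out << results[i];
        }
        out << '\n';
        ++count;
    }
    return count;
}
```

With real files, `in` would be a `std::ifstream` opened on the word list and `out` a `std::ofstream` for the resulting dictionary file.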

I’ve played a bit with Google Translate and found out two things that help me a lot (which might, however, seem obvious to you):

- The text to be translated is part of the URL, so I can build the URL myself in order to insert the word I need to translate.

- I found where the translations are stored within the HTML, so I can easily read them from there.
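Building such a URL means percent-encoding the word first, since spaces and umlauts cannot appear in a URL as they are. A minimal sketch, assuming the word is UTF-8 encoded; `percentEncode` and `buildUrl` are made-up names, and the base URL is a parameter because the exact address and query parameters are not given here:

```cpp
#include <string>

// Percent-encode a string for use in a URL query (RFC 3986):
// letters, digits and "-_.~" are kept, every other byte becomes %XX.
std::string percentEncode(const std::string& text) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : text) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '_' || c == '.' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];    // high four bits
            out += hex[c & 0x0F];  // low four bits
        }
    }
    return out;
}

// Append the encoded word to a base URL that already ends in "...=".
// The base URL is an assumption supplied by the caller.
std::string buildUrl(const std::string& base, const std::string& word) {
    return base + percentEncode(word);
}
```

For example, `buildUrl("https://example.com/translate?text=", "to run")` would give a URL ending in `text=to%20run`; the example.com address is only a placeholder.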

My problem now, however, is that I don’t know how to access the HTML for the corresponding URLs.

I tried curl, but the “easy” functions don’t work. I get the error message LNK2019 in Visual Studio 2019 and can’t seem to fix it. I looked into easy.h, where they also appear, and even there VS says that it can’t find any function with that name.

https://opendata.stackexchange.com/questions/1390/german-english-dictionary

What you're trying to do with the web page is probably against the terms of service. Use the provided API, and ideally the SDK, and you won't need to mess with HTML and pages that probably require JavaScript.

Or look at one of the competing offerings; I believe there are even some open source tools and data around.

Thanks for the answers. I'll try the dictionary @NikiTo linked, and if that doesn't work I will look into the API that @syncviews mentioned.

This topic is closed to new replies.
