"All implementations of a function should return the same results given the same inputs"

@DVrandecic (WMF): This does not solve everything.

  1. First, it assumes all functions operate normally. For example, if you define a function a+b, there will be integer overflow in C++; in JavaScript there will not be overflow, but precision may be lost (as integers are stored as double-precision floating-point numbers); neither will happen in Python. So before we can say "equivalent", we need to define the scope of the parameters.
  2. Also, some implementations have no hard scope, but it is obviously impractical to calculate 1000000000+1000000000 if we define a+b recursively as (a+1)+(b-1).
  3. For floating-point calculations there may be accuracy problems in the results. For example, two implementations may use different methods to calculate pi and get different results.
  4. Some algorithms use some randomness in their solution, so the result may differ across multiple executions. Some problems have no unique solution, such as a function to convert a number to English.
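The precision difference in point 1 can be illustrated with a short sketch (Python here; its floats behave like JavaScript numbers, i.e. IEEE-754 doubles, while its ints are arbitrary-precision):

```python
# Python ints are arbitrary-precision, so integer addition is exact.
a = 2**53
assert a + 1 - a == 1

# JavaScript stores every number as an IEEE-754 double; Python floats
# behave the same way, so precision is lost beyond 2**53.
b = float(2**53)
print(b + 1 - b)  # 0.0: the +1 is absorbed by rounding
```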

Therefore we may need to introduce a higher level of abstraction, such as an "interface", to cover similar but not equivalent implementations. --GZWDer (talk) 14:36, 20 June 2021 (UTC)

I agree: there are different use cases for a reply, such as different levels of accuracy, completeness, and coverage (notably for statistics and measurements, as well as most mathematical functions). Increasing the accuracy, completeness, or coverage will require more resources: CPU cycles, time/delay to reply, storage space, energy; so, different costs. And with increased delay, accuracy may also increase or decrease (e.g. what will the weather be this afternoon? who is the current best-selling rock star in that country?)
Cost is something that is evaluated with different factors. And if a reply cannot be given immediately, the use case is lost and we lose energy trying to reach the definitive response, when most of the time we just need a reasonable estimate (which may be refined later; i.e. the reply will continue to evolve over time, meaning that the current result of a function is not unique: it is normally unique only within the given time/resource constraints).
And even if a definitive response is not immediately available (or not fully checked using alternate implementations, which also need their own computation time and resources), we should have a cache still holding past results; and if we use these results (e.g. in a composition), the error margins of the estimations and the freshness of the cached result should also be part of the response.
So in general we will not get a single response but many (just like in web search engines), ordered by some ratings, with many others coming later (which may be better rated), including for mathematical functions.
And at the start we will not know at all which implementation will be the best or the fastest, or which one has enough available resources to start computing some interesting results or estimations. As well, when an implementation starts running it should be able to give regular estimates of its own progress, and an estimation of time (n seconds per completeness percentage); this time estimation could also vary over time (just look at the completion-time estimate shown by Windows Explorer when you copy a lot of files: it is not a flat curve, it has many spikes).
This happens in fact in almost every piece of software and every OS, but also in all human-driven activities, and in nature in general: we have unpredictable "randomness" because we don't know all the variables at the start, but also some intrinsic randomness everywhere in physics with visible effects (the effects of entropy or temperature, whose properties emerge from the multiplication of items and variables; entropy is not conservative, so its evolution is not fully predictable). Entropic effects will naturally occur in all human knowledge and in human language. This is a good thing because it allows "creativity" and local changes; our world is not universally flat, and we all have a different vision of it. "Truth" or "correctness" is difficult to assess; we always have margins of error in all replies, so Wikifunctions will naturally have to accept such margins of error, but will focus just on reducing them, using heuristics rather than algorithms most of the time. These heuristics may also fail to give any result in a reasonable time, so the system will have to return "timeouts"; Wikifunctions could then propose that the asker provide more resources than Wikimedia has, but resources will be limited for everyone: no one has infinite energy or money to spend to get a definitive result whose late benefit will not reach the level of resources spent to get it.
So we must accept variability. Our goal will just be to tune the system so that it will optimize its resources to get a bit more results or precision.
For this reason, I would not call them "function evaluators" but "function estimators". Their result is not a single value but a list of results qualified by ranking factors, along with the costs already spent to get these responses. verdy_p (talk) 17:39, 21 June 2021 (UTC)
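As a minimal sketch of that "function estimator" idea (the names here are illustrative, not a Wikifunctions API), an estimator could return a list of candidate results, each qualified by a ranking factor and the cost already spent:

```python
from dataclasses import dataclass

@dataclass
class Estimate:
    value: object       # the candidate result
    confidence: float   # ranking factor, 0.0..1.0
    cost_ms: int        # compute time already spent on this candidate

def rank(estimates):
    # Best-rated candidates first; cheaper, rougher answers stay in the list.
    return sorted(estimates, key=lambda e: e.confidence, reverse=True)

results = rank([
    Estimate(3.14, confidence=0.6, cost_ms=1),
    Estimate(3.14159265, confidence=0.99, cost_ms=120),
])
print([e.value for e in results])  # [3.14159265, 3.14]
```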
@GZWDer: These are great points! Here are a few thoughts:
  1. "For example, if you define a function a+b, there will be integer overflow in C++; in JavaScript there will not be overflow, but precision may be lost" - you are right. For these cases we should define our Types accordingly. If we are to use the built-in integer addition from JavaScript, or the built-in addition from C++, we must make sure that the translation to the types in the given languages is correct: e.g. we could define the positive integer type to cover only 0 to, say, 4 billion. So, yes, you are right, but these are things we need to take into account when creating our functions. The main idea still stands: all implementations must return the same results, or else they are erroneous.
  2. "Also, some implementations have no hard scope, but it is obviously impossible to calculate 1000000000+1000000000 if we define a+b as (a+1)+(b-1)." Yeah, some implementations will run out of compute or space. That's something we need to deal with independently of what exactly they encode.
  3. "For floating-point calculations there may be accuracy problems in the results. For example, two implementations may use different methods to calculate pi and get different results." Oh, that's a good point and something I forgot to consider. For testers I had a solution: the testing result will be checked by a function and thus would also be able to consider accuracy. For synthetic results which are not testers... hmm. It would be good to allow for some form of epsilon or something similar; otherwise this might not be realistic. That's an open question, and I filed a Phabricator task for it.
  4. "Some algorithms use some randomness in their solution, so the result may differ across multiple executions. Some problems have no unique solution, such as a function to convert a number to English." For the beginning we will intentionally not work with non-deterministic algorithms; we will revisit that once the deterministic ones work.
  5. "Therefore we may need to introduce a higher level of abstraction such as "interface" to cover similar but not equivalent implementations." The idea is that the Function is the interface, and the Implementation is the implementation. I hope that makes sense.
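On point 3, an epsilon-based tester could compare floating-point results up to a tolerance instead of requiring bitwise equality. A hedged sketch (the two pi computations are illustrative):

```python
import math

# Two implementations computing pi by different methods:
pi_leibniz = 4 * sum((-1)**k / (2 * k + 1) for k in range(100_000))
pi_machin = 16 * math.atan(1 / 5) - 4 * math.atan(1 / 239)

# Exact equality fails for the slowly converging series...
assert pi_leibniz != math.pi
# ...but an epsilon-aware tester can accept both implementations.
assert math.isclose(pi_machin, math.pi, rel_tol=1e-12)
assert math.isclose(pi_leibniz, math.pi, rel_tol=1e-4)
```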
Again, thanks, great points! --DVrandecic (WMF) (talk) 21:56, 25 June 2021 (UTC)
@Verdy p: I like the idea of "function estimators" (although I'd call them differently), which return a ranked list of possible answers depending on the context and how much resource is available. For example, one could ask "How many e's are there in the English Wikipedia?" One system could just take the size of Wikipedia and assume the standard English letter frequency to provide an answer; another might take a sample of a hundred random articles, count, and then multiply by 65,000 (the roughly 6.5 million articles divided by the sample size); a third could actually run a map-reduce over all articles and count. And depending on available resources we can run 1, 2, or 3, or any combination.
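The second, sampling-based heuristic might look like this sketch (the corpus here is a made-up stand-in for the real articles):

```python
import random

random.seed(0)
# Hypothetical corpus standing in for wiki articles.
corpus = ["the letter e appears here " * random.randint(1, 20)
          for _ in range(10_000)]

# Count in a small random sample, then scale to the full corpus size.
sample = random.sample(corpus, 100)
estimate = sum(a.count("e") for a in sample) * (len(corpus) / len(sample))
exact = sum(a.count("e") for a in corpus)
print(estimate, exact)  # close to exact, at a fraction of the cost
```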
I wouldn't think of these as the usual functions, though. The ones I have in mind are much simpler, and such an approach would rarely be necessary for the functions I am thinking of. But once we scale further, we will get to more and more functions where a "run a number of heuristics and rank the results" approach will become more and more interesting.
For that case, I hope, Wikifunctions will be flexible enough to allow for a type that would encapsulate such function calls, and to introduce new types akin to functions and implementations that would work in the way you describe.
I really like how far you are pushing these ideas - that's great! And I think you are right that we will need an infrastructure that supports them. For now, we are still dealing with simpler and more basic issues, but if all goes well, we will run exactly into that situation you describe. And I hope that the system will be flexible enough to accommodate them out of the box.
Thank you for those fascinating thoughts! --DVrandecic (WMF) (talk) 21:56, 25 June 2021 (UTC)
Maybe a version returning a single result would in fact be an iterator: the return value is a composite object containing the first result along with metadata, such as whether there are other results and how many can be retrieved at once. The input parameters could also contain a reference to this object (nil by default), or capabilities (such as whether the client can accept a list of results, and how many it accepts); the evaluator/estimator can then use this context as a hint about how to process the query, and if the client accepts a list of results, it would return such a list. The list is not necessarily fully ordered, but the input parameters can specify an order and a starting point, just like we can browse through web search results, or browse pages listed in a wiki category. But the client could also order the results itself, saving a lot of resources on the evaluator/estimator's server.
As well, metadata in the result can contain rankings, completion level, and an estimation of the time before re-querying if there is progress (useful when we implement caches of function calls and results, and when we have a farm of external servers and computing resources to manage and throttle; but the evaluator can also have input parameters allowing it to select existing cached values if their completion level is better while the newest queries are still incomplete, or not complete enough compared to what's in the cache). verdy_p (talk) 22:08, 25 June 2021 (UTC)
Finally, the estimators may not be fully algorithmic but may involve humans (e.g. organizing votes, validating translations, reviewing and correcting some content, auditing a violation of rights, or looking for references): the time to get the response will depend on human activities, and estimators can then provide alternate replies with different completion levels. In summary, we get what is done in major web search engines capable of answering complex queries, or in big databases computing/updating statistics cubes based on a large live database which continues to grow or be updated... verdy_p (talk) 22:18, 25 June 2021 (UTC)
Also, metadata returned for results could accumulate tracking info as well, notably sources/references and licensing requirements: some queries should fail (refuse to combine and compute the inputs if they are incompatible, or reduce the input set to provide partial results) if these requirements are not met (requirements may include privacy options, authorization tokens, security scopes...). verdy_p (talk) 22:28, 25 June 2021 (UTC)
@Verdy p: I think some of these things will be possible with Wikifunctions out-of-the-boxish, others will require an external service to implement and keep state, and can use Wikifunctions functions to calculate certain values. For example, returning an iterator in the form of a calculated head and a tail that is actually a thunk that in turn calculates the rest - yes, I absolutely think that we should have that!
We currently have phab:T282164 open, which is exactly about what kind of metadata a function result should come with. Thoughts are welcome! -- DVrandecic (WMF) (talk) 21:24, 3 September 2021 (UTC)
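The head-plus-thunk iterator mentioned above could be sketched like this (illustrative only, not the eventual Wikifunctions API):

```python
# The head is computed eagerly; the tail is a zero-argument thunk that
# computes the rest of the stream only when the client asks for more.
def lazy_results(compute, start=0):
    head = compute(start)
    return head, (lambda: lazy_results(compute, start + 1))

head, tail = lazy_results(lambda n: n * n)
first_three = [head]
for _ in range(2):
    head, tail = tail()   # force the thunk to get the next result
    first_three.append(head)
print(first_three)  # [0, 1, 4]
```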